Abstract

In a world with frictions, the "worthwhile-to-move" incremental principle is a mech- 
anism where, at each step, the agent, before moving and after exploration around the 
current state, compares intermediate advantages and costs to change to advantages and 
costs to stay. These advantages and costs to change are not the usual benefit and cost
functions. They are behavioral, including goal setting, psychological, cognitive (learn- 
ing) and inertia aspects. The agent is supposed to have a long-term goal and limited 
needs. At each step, the agent chooses between repeating the action or changing it. 
Acceptable moves are such that the "advantages to move rather than to stay" are higher than
some fraction of the "costs to move rather than to stay," with, as a result, a limitation of the
intermediate sacrifices to reach the goal. The transition process is made of a punctu- 
ated succession of static exploration-exploitation phases, and dynamic moving phases. 
This "dynamic and reference dependent" incremental cost-benefit behavior improves 
and leads to local actions in an endogenous way. It converges if the agent has high 
enough local costs to move, because of an entropy property. 

When the agent is more goal-oriented and wants to "improve enough" at each step, 
the process shrinks (the diameter of the state-dependent "worthwhile-to-move" set 
decreases to zero), and reduces to one state if the goal function is upper semicontinuous. 
The process ends in a permanent routine, a rest point (which may be lower than any 
local maximum) where the agent prefers to stay rather than to change, in spite of some possible
residual frustration at having missed his goal. Convergence in a finite number of steps
occurs if the agent chooses a finite total exploitation time. 

The model describes a full range of behaviors. Psychology and cognition play an 
important role in the balance between motivation and fear of change. Links with optimization
theory and variational inequalities (Ekeland's variational principle) are given.
The "as if hypothesis" (as if agents would optimize) is revisited, leading to the intro- 
duction of a new class of optimization algorithms with inertia, called the "local search 
and proximal algorithms". 

1 Introduction: From Incrementalism to Instrumentalism to Optimization

In a world with frictions, the "worthwhile-to-move" incremental principle allows us to handle
the extreme but very realistic case of a "muddling through" decision-making process where
the agent at each step tries to do a little better than before (Lindblom, 1959). By adding an
instrumental goal-setting principle, the "improving enough" principle, this model handles a
great variety of intermediate goal-setting behaviors, allowing us to use inertia and explaining
the formation of routines. Satisficing (Simon, 1956) is the most noteworthy example. The 
traditional case of global optimization is at the other extreme of this range of behaviors, in 
a world deprived of friction. 

The term "agent" is taken in a broad sense: it may represent a single agent, a problem
solver, or a group whose structure remains identical during the process. We do not examine 
here the case of interacting agents with strategic features. 

Behavioral imperfections are the rule, not the exception. This is true for human behaviors,
for decision making, and for problem solving. The world is full of frictions. Human
beings have limited physical resources, bounded cognitive abilities, and psychological biases
which distort their evaluations and their motivations to search. Human preferences and
beliefs may be inconsistent; motivations to act can be unconscious, vague, too low, or impulsive;
goals are frequently ill-defined; knowledge may be inadequate. Agents cannot take
advantage of every piece of information they receive; they ignore part of it. These
imperfections are costly to remedy. Group members have non-congruent goals
and partial conflicts of interest, and they lose time bargaining and resolving conflicts. One can
distinguish three kinds of imperfection:

i) Cognitive imperfections: lack of knowledge is one limit for decision making; 

ii) Psychological imperfections: they are the main source of problems in driving action
properly. They concern psychological biases of evaluation, editing, goal setting, motivation
building, dealing with frustration feelings, unclear goals, impulsive behaviors, and emotional biases.
We can mention reference-dependent biases, anchoring effects, framing effects, bracketing,
narrow or broad mental accounting, salience effects, hyperbolic discounting, impatience, regret,
negative feelings for ambiguity, closure effects, inhibition and excitation, and the role of
stress and emotions;

iii) Physical and physiological imperfections: inertia and frictions (Rumelt, 1990) are the 
main problems for action. 

Lack of knowledge has been well documented, but goal setting, inertia, frictions, and costs
to change much less so. We first model the imperfections related to inertia and frictions.

Because imperfections are everywhere, most of our behaviors are incremental. They work 
step by step, using small improving steps and local improving actions, local exploration 
devices, and trial and error. Incremental behaviors characterize low goal-oriented "decision-making"
where, at each step, the agent compares local advantages and costs to move to
advantages and costs to stay. These incremental behaviors improve in time. Practical 
examples of behavioral "local cost-benefit processes" are "pros and cons lists" and "plus- 
minus arguments." Franklin (1978) urges agents to list "strengths and weaknesses" to help 
decision-making. In political science, "muddling through" behaviors (Lindblom, 1959) de- 
scribe administrative and political decision-making processes. Agents and organizations are 
never supposed to optimize. 

Incremental behaviors are state-dependent when choices are sequential, restricting the 
choice set by successive eliminations. In most cases, choosing to do something is anchored in 
previous choices because choosing to change entails choosing not to stay. Choosing is rarely 
between new options, but between a new and an old one (an anchoring effect of successive 
comparisons) : to stay or to move, to stop doing something or to carry on, to buy the same 
good as before or not. Most behaviors are reference-dependent, including anchoring, mental 
accounting, and bracketing. 

Repeated choice matters as a step in a dynamic process. This leads to the concept of 
temporary routines where the agent repeats the same choice and changes from time to time, 
moving from a temporary routine to a new one, to finally end in a permanent routine. 

At one extreme there are incremental behaviors, at the other traditional global optimization.
In this ideal case, the agent discovers the whole state space of alternatives in
one preliminary hidden step, before, in one shot, pairwise comparing alternatives. This
represents a static and global cost-benefit analysis.

Our purpose is to pave the way between these extremes, from incremental and low goal- 
oriented behaviors to increasingly goal-oriented (instrumental) behaviors, up to optimizing 
behaviors. We introduce intermediate goal setting "decision-making" processes, such as 
intermediate satisficing (improving enough) where goals change along the adaptive process.
They vary with motivation and depend on goal-setting costs. Intermediate payoffs become 
endogenous to the course of "decisions followed by actions" and cannot be cumulated ex 
ante, as this is done in a substantive dynamical model. 

Assume that, out of ignorance, the agent does not optimize (he knows neither
the underlying state space nor his utility function ex ante). At each step, he must
explore around the current state to discover his local environment and possibly improve his 
performance. This is a local search optimizing process (hill climbing). If the agent wants to 
improve his motivation to try to "improve more" and "more quickly" at each step, he can set 
intermediate goals (intermediate aspiration and intermediate satisficing levels). These goals
help him to drive his intermediate exploration process to "explore enough" but "not too 
much" around to be able to "improve enough" at each step. This is the "gradual satisficing" 
process of Soubeyran (2006), Martinez-Legaz and Soubeyran (2002), without inertia and 
frictions. 

In the presence of inertia ("costs to move"), the agent must do more to sustain his
intermediate motivation: he must "improve even more" at each step to be able to compensate
intermediate "costs to move" by intermediate "advantages to move." Then, it is "worthwhile
to move." This is a way to limit intermediate sacrifices.

Our model of incremental decision-making involves three interrelated blocks:

1) a motivation building and goal-setting block where, at each step, the agent tries to
"improve enough" (improving or satisficing) his per unit of time payoff $g(y) \in \mathbb{R}$ generated
by each action $y \in X$. The agent solves, at each step, a qualitative inequality

$y \in I[x, \varepsilon(x)] = \{y \in X : g(y) - g(x) \geq \varepsilon(x) > 0\} \subset X$ (1)

where the temporary satisficing gap is $\varepsilon(x) > 0$.

2) a learning block, where, at each step, the agent tries to find a satisfactory action. This
requires evaluating the per unit of time payoff $g(y)$ around the present action $x \in X$. At
each step, the agent solves the qualitative inclusion

$y \in E[x, r(x)] \subset X$ (2)

with $r(x)$ equal to the radius of the exploration set around $x \in X$.

3) a transition "worthwhile-to-move" block where, at each step, the agent considers whether to
move or to stay: acceptable changes are such that the estimated behavioral advantages to
move $A(x,y)$ are higher than some fraction $1 > \xi(x) > 0$ of the estimated behavioral costs
to move $C(x,y) \geq 0$. The agent solves a qualitative "worthwhile-to-move" inequality

$y \in W(x) = \{y \in X : A(x,y) \geq \xi(x) C(x,y)\}$. (3)

Each of these blocks contains three aspects:

i) At each step, heuristics and qualitative tools are used by the agent in order to estimate 
advantages and costs; 

ii) At each step, punctuated dynamical aspects enter into the description of the articu- 
lation between static phases of exploration-exploitation and dynamic moving phases linking 
two consecutive temporary routines. 
iii) Behavioral aspects include physical, physiological, psychological, cognitive, and social
features. 

Rumelt (1990) provides an excellent literary presentation of inertia. Conlisk
(1996) emphasizes the importance of deliberation costs. Lipman and Wang (2000, 2006) 
introduce costs to change actions in the theory of non adaptive repeated games with perfect 
rationality (agents optimize a long term objective, a mean of cumulated payoffs). Our 
formalization is far more general, using the concept of distance as a dissimilarity index. We 
model: 

• Adaptive processes of decision-making such as "muddling through" (Lindblom, 1959)
and "satisficing process" (Simon, 1955) which limit intermediate sacrifices necessary to
reach the final moving goal. Such "worthwhile-to-move" processes adapt more or less 
to inertia during the transition from the beginning to the end. To our knowledge, there 
exists no formal model of "muddling through." For a dynamic model of satisficing, see 
Selten (1998); for a static model of satisficing, see Tyson (2005). 

• Qualitative heuristics of search and exploration, in a topological context, using an 
enclosing principle (using inequalities rather than equalities, putting bounds on control 
variables) (Gigerenzer and Todd, 1999; Gilovich et al., 2002).

• Path dependency and lock-in effects. We show that "muddling through" behaviors are 
path dependent because they have reference-dependent payoffs, and make small steps 
because they converge, due to high costs of change and bounded needs. The closest 
model to ours is the local search hill climbing algorithm (Aarts and Lenstra, 2003) 
which at each step improves by local exploration but which ignores inertia. 

• Habits and routines as the outcomes of habituation and routinization of temporary 
habits and routines which converge to permanent habits and routines. Our model 
shows when "temporary satisficing" behaviors converge to a "permanent" routine (lo- 
cal maximum, attractor, fixed point). This occurs indeed when the agent is motivated 
and "improves enough" at each step. For optimization of habits and addiction, see
Abel (1990) and Carroll (2001); for habits and procrastination, see O'Donoghue and
Rabin (1999); at the organizational level of routines, see Nelson and Winter (1997).

We shall also calibrate the dynamic inefficiency of such adaptive processes to know how
far from optimization agents behave, show how to overcome inertia, and prove the "(epsilon)-Variational
principle" of Ekeland (Attouch and Soubeyran, 2006). This leads us to revisit the
"as-if-hypothesis" where, as in economics, agents optimize, and to bring to the fore the "local
search and proximal algorithms." These algorithms, which involve inertia features, are a
mixture of local search algorithms and proximal algorithms. Moreover, this provides a general
framework and a large field of applications for proximal algorithms (Attouch and Bolte,
2006; Attouch et al., 2007; Attouch and Teboulle, 2004). Connections with second order
dynamic optimization models with memory can be made (Attouch et al., 2000; Attouch and
Soubeyran, 2006; and, for long memory effects, Goudou and Munier, 2005).

In section 2, we consider incremental worthwhile-to-move behaviors used to bound intermediate
sacrifices. In section 3, we emphasize the inertia context with intermediate costs
to move. A typology of costs to move is introduced. In section 4, we detail the goal-setting 
context with intermediate advantages to move. In section 5, we model the punctuated 
exploration-exploitation and moving process of decision-making. In section 6, we examine 
behavioral dynamics with not too many intermediate sacrifices. In section 7, we consider 
incrementalism through enclosing. In section 8, we state the "worthwhile-to-move" theorem 
which gives general conditions for the convergence of the process toward a permanent rou- 
tine. In section 9, we revisit the "as-if-hypothesis" by introducing inertia aspects and put 
to the fore the role of "local search proximal algorithms." 
2 Incrementalism: "Worthwhile-to-Move" Behaviors Bound Intermediate Sacrifices

2.1 The State-Dependent Balance between Intermediate Advantages and Costs to Move

We model "muddling through" behaviors (making small steps, improving step by step, 
Lindblom, 1959) comparing advantages and costs to change at each step. 

The simplest way to decide what to do between two possibilities is to compare advantages 
and costs. The point is that most decisions are state-dependent, part of a dynamic decision- 
making process. They represent intermediate decisions to reach well defined goals. 

Consider the agent at $x \in X$ who has some motivation to change. He wants to move from
$x \in X$ to some $y \in X$. He explores around $x \in X$, in an exploration set $E(x, r(x))$ of size
$r(x) > 0$, to estimate and compare state-dependent intermediate advantages $A(x,y) \in \mathbb{R}$
and costs $C(x,y) \in \mathbb{R}^+$ to move from $x$ to $y$. The size of an exploration set $E(x)$ can be
its radius $r(x) = \sup \{d(x,y),\ y \in E(x)\}$. To simplify, we consider only scalar advantages
and costs. They are reference-dependent, the reference being, at each step, the state $x$ of
departure, then $y$, and so on. The case of multidimensional advantages and costs to move
$A(x,y) \in V$, $C(x,y) \in V$ was examined in Soubeyran and Soubeyran (2004, in a Riesz space
$V$; revised version 2007, in a semigroup $V$).

According to the "worthwhile-to-move" principle, an acceptable move is such that the
estimated advantages are higher than some proportion $1 > \xi(x) > 0$ of the estimated costs:

$A(x,y) \geq \xi(x) C(x,y)$. (4)

A move satisfies the "worthwhile-to-move" principle if and only if it satisfies this inequality.
The sacrificing rate is $1 - \xi(x)$; the portion of the costs to move which the agent does not
put in the balance is $(1 - \xi(x)) C(x,y)$. The non-sacrificing rate is $\xi(x)$.
The "worthwhile-to-move" set at $x \in X$ is defined by

$W(x) = \{y \in X : A(x,y) \geq \xi(x) C(x,y)\} \subset X$. (5)

This defines a "worthwhile-to-move" set-valued relationship $x \in X \mapsto W(x) \subset X$. We
assume that state-dependent advantages and costs to move are zero if the agent stays at
$x \in X$: $A(x,x) = C(x,x) = 0$ for all $x \in X$. Thus $x \in W(x)$ for all $x \in X$. We also assume
that $C(x,y) > 0$ for $y \neq x$.

We reformulate this relationship by using a standard goal function $g : x \in X \mapsto g(x) \in \mathbb{R}$.
The real number $g(x)$ represents the instantaneous utility of the agent at state $x$. Starting
from a known couple $(x, g(x)) \in X \times \mathbb{R}$, the agent does not know the values $g(y)$ of his
utility function, for states $y \neq x$, without exploration around $x$. Intermediate advantages to
move are

$A(x,y) = t(y) (g(y) - g(x))$ (6)

where $t(y) \geq 0$ is the length of time during which the agent can benefit or choose to exploit
his instantaneous advantages to move $g(y) - g(x) \geq 0$. This length of time $t(y)$ defines his
intermediate exploitation. Because costs to move are nonnegative, a "worthwhile-to-move"
choice $y \in W(x)$ improves when $t(y) > 0$: $y \in W(x) \Rightarrow g(y) \geq g(x)$. Costs to move can be zero, in the absence
of friction. In this case of no inertia, if the exploitation time $t(y) > 0$ is strictly positive, the
worthwhile-to-move relationship reduces to improving: $y \in W(x) \Leftrightarrow g(y) \geq g(x)$.
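As a minimal sketch of definitions (5) and (6), the following Python function filters a finite list of candidate states and keeps those that are worthwhile to move to. The utility `g`, the cost function `cost`, the grid, and the two values of `xi` are hypothetical choices made for illustration; none of them comes from the text.

```python
from typing import Callable, Iterable, List


def worthwhile_to_move_set(
    x: float,
    candidates: Iterable[float],
    g: Callable[[float], float],
    cost: Callable[[float, float], float],
    xi: float,
    t: Callable[[float], float] = lambda y: 1.0,
) -> List[float]:
    """Return the candidates y such that A(x, y) >= xi * C(x, y),
    with A(x, y) = t(y) * (g(y) - g(x))."""
    return [
        y for y in candidates
        if t(y) * (g(y) - g(x)) >= xi * cost(x, y)
    ]


# Hypothetical data: single-peaked utility on a grid,
# cost to move proportional to the distance travelled.
def g(y: float) -> float:
    return -(y - 3.0) ** 2


def cost(x: float, y: float) -> float:
    return 2.0 * abs(y - x)


grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

w_tolerant = worthwhile_to_move_set(1.0, grid, g, cost, xi=0.5)
w_demanding = worthwhile_to_move_set(1.0, grid, g, cost, xi=1.5)
print(w_tolerant)   # states accepted when half of the costs are weighed
print(w_demanding)  # states accepted when costs are weighed 1.5 times
```

In this toy setting the current state always belongs to its own set, and raising `xi` removes the more distant states first, which is the mechanism behind the small steps discussed later.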
We consider three kinds of inefficiencies: "lack of motivation," "lack of knowledge," and 
"high local costs to move (inertia)" which we will model with the help of three blocks 
and their interrelations: "goal setting," "exploitation-exploration," and "moving." We first 
examine the moving block, consisting of acceptable transitions where sacrifices are not too
high. 

Intermediate sacrifices are maximal when the non-sacrificing ratio is zero, $\xi(x) = 0$. Then,
the agent bears costs to move, but ignores them during the transition. In this case, if the
exploitation time is strictly positive, the worthwhile-to-move relationship reduces once again
to improving, $y \in W(x) \Leftrightarrow g(y) \geq g(x)$, from $A(x,y) = t(y)(g(y) - g(x)) \geq \xi(x) C(x,y) = 0$
and $t(y) > 0 \Rightarrow g(y) \geq g(x)$.

When $0 < \xi(x) < 1$, the agent ignores part of the costs to move, the intermediate sacrifices
$(1 - \xi(x)) C(x,y)$. In this case, the ratio "advantages over costs to move" $A(x,y)/C(x,y)$
must be higher than $\xi(x) < 1$.

When $\xi(x) > 1$, the agent is much more demanding: he refuses intermediate sacrifices
and requires an extra gain $A(x,y) - \xi(x) C(x,y) \geq 0$. In this case, the ratio "advantages
over costs to move" $A(x,y)/C(x,y)$ must be higher than $\xi(x) > 1$.

In each case, the "worthwhile-to-move" condition is a reference state-dependent satisficing
condition: $A(x,y)/C(x,y) \geq \xi(x) > 0$ for $y \neq x$ and $C(x,y) > 0$. The "worthwhile-to-move"
definition (5) can be equivalently formulated as

$W(x) = \{y \in X : A(x,y)/C(x,y) \geq \xi(x) > 0 \text{ if } y \neq x\} \cup \{x\}$. (7)

To simplify this initial presentation of the "worthwhile-to-move" principle, take the inter- 
mediate exploitation time t{y) = T as a given constant. We will remove this simplification 
very soon. 

The "worthwhile-to-move" principle defines an acceptable transition process $x_{n+1} \in W(x_n)$,
$n \in \mathbb{N}$. At each step, when moving from $x_n$ to $x_{n+1}$, intermediate sacrifices are
not too high: advantages to move $A(x_n, x_{n+1})$ are greater than some fraction $\xi(x_n) > 0$
of costs to move $C(x_n, x_{n+1})$. At each step the agent improves his goal from $g(x_n)$ to
$g(x_{n+1}) \geq g(x_n)$, $n \in \mathbb{N}$.

If the agent wishes to move toward a final goal, and if costs to move are non negligible, the 
agent must, at each step, compensate local costs to move C{x, y) by some local advantages 
to move A(x,y). This defines an acceptable transition process. 
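The transition process can be sketched in a few lines of Python. The sketch below assumes a finite set of states on the real line, a fixed exploration radius, a constant exploitation time `T`, and a greedy selection rule (take the explored worthwhile move with the largest net gain). The selection rule and all numerical values are assumptions made for illustration: the text only requires each move to be worthwhile.

```python
from typing import Callable, List, Sequence


def worthwhile_process(
    x0: float,
    states: Sequence[float],
    g: Callable[[float], float],
    cost: Callable[[float, float], float],
    xi: float,
    radius: float,
    T: float = 1.0,
    max_steps: int = 1000,
) -> List[float]:
    """Generate x_0, x_1, ... with x_{n+1} in W(x_n); stop when no explored
    move is worthwhile. max_steps guards against cycling when costs are zero."""
    path = [x0]
    x = x0
    for _ in range(max_steps):
        # Exploration phase: only states within `radius` of x are evaluated.
        explored = [y for y in states if y != x and abs(y - x) <= radius]
        # Transition test: A(x, y) >= xi * C(x, y), A(x, y) = T * (g(y) - g(x)).
        worthwhile = [
            y for y in explored if T * (g(y) - g(x)) >= xi * cost(x, y)
        ]
        if not worthwhile:
            break  # the agent prefers to stay at x
        # Assumed selection rule: largest net gain among worthwhile moves.
        x = max(worthwhile, key=lambda y: T * (g(y) - g(x)) - xi * cost(x, y))
        path.append(x)
    return path


# Hypothetical example: utility peaked at 3, cost proportional to distance.
def g(y: float) -> float:
    return -(y - 3.0) ** 2


def cost(x: float, y: float) -> float:
    return 2.0 * abs(y - x)


states = [0.25 * i for i in range(25)]  # grid from 0.0 to 6.0
path = worthwhile_process(0.0, states, g, cost, xi=0.5, radius=0.5)
print(path)
```

In this example the agent improves at every step but stops before the maximizer of `g`, because the remaining gain no longer covers the weighed cost of moving.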

The "worthwhile-to-move" principle governs a lot of weakly goal-oriented behaviors, 
where, at each step, an agent, starting from the current state, tries both to improve, and 
to balance local advantages to move against a minimal fraction of costs to move. We will show
that the course of choices and actions of this local "behavioral cost-benefit" decision-making
is localizable, because it is nested. But it does not necessarily shrink to a point. Such a 
process is incremental, reference and path-dependent. This "worthwhile-to-move" process 
characterizes a behavioral "cost-benefit" analysis and a "muddling through" process of ad- 
ministrative behavior (Lindblom, 1959), where frictions and inertia play a major role. This 
incremental decision-making describes a "mental accounting" behavior. 

Consider an agent who, at state $x$, benefits from the instantaneous utility $g(x)$ while his
aspiration level at $x$ is higher: $\tilde g(x) > g(x)$. While staying at $x \in X$, he feels the frustration
$f(x) = \xi(x) (\tilde g(x) - g(x)) > 0$ generated by the per unit of time unsatisfied need $n(x) =
\tilde g(x) - g(x) > 0$. The weight $\xi(x) > 0$ represents his per unit frustration feeling. This
pushes him to change for a new state $y \in X$ with a better instantaneous utility $g(y) > g(x)$.
This is incrementalism where an agent tries to gradually improve his instantaneous utility.
This is the case of a goal-oriented agent who only wants to reduce his frustration feeling by
improving. He will either explore in a given exploration set or enlarge his exploration set
step by step. In both cases, he will stop the exploration around $x$, in the exploration set
$E(x, r(x)) \subset X$, as soon as he finds a new state $y \in E(x, r(x))$ which improves, $g(y) > g(x)$.

This is the no-friction case with no cost to move. In the friction case, the agent must compensate
the costs to move by high enough advantages to move. He must improve enough to
limit the intermediate sacrifices of moving. This defines a "worthwhile-to-move" constraint
during the transition, $y \in W(x)$, which is expressed as

$T (g(y) - g(x)) \geq \theta(x,y) d(x,y)$ (8)

where $d$ is a distance on $X$ and the intermediate exploitation time $t(y) = T$ is taken as a given
constant. In our simple case, we take the ratio $\theta(x,y)$ higher than some constant $\theta > 0$.

As said before, incrementalism in terms of goal improvement is when an agent tries to 
improve his per unit of time utility step by step. A world of frictions adds "costs to move." 
There is a need to "improve enough" at each step in order to limit intermediate sacrifices 
coming from costs to move. We will show that such an "improving enough" process generates
local actions, small steps, local "worthwhile-to-move" transitions toward a new improving 
state. These improving transitions must be acceptable, in the sense that they present not 
"too many sacrifices" in the short run. In this case, exploration is not goal-oriented. The 
agent learns by doing. 

A worthwhile-to-move behavior usually requires more than improving, $g(y) - g(x) \geq 0$.
This is "improving enough", $g(y) - g(x) \geq \varepsilon(x) > 0$, for some given and feasible gap of
improvement $\varepsilon(x) > 0$, to compensate for moving costs. This requires both
$T (g(y) - g(x)) \geq \varepsilon(x) > 0$ and $\varepsilon(x) \geq \theta(x,y) d(x,y)$. This remark will help us later to make the link between
a worthwhile-to-move process and an intermediate satisficing process.

2.2 Local Actions and Convergence 

We show that an incremental "worthwhile-to-move" behavior has the "local action" property, 
and converges (both goals and states). Local action is a consequence, not a hypothesis.

The state space $X$ is a metric space with distance $d$. Taking the exploitation time $T = 1$,
the "worthwhile-to-move" inclusion (8) becomes

$W(x) = \{y \in X : g(y) - g(x) \geq \theta(x,y) d(x,y)\} \subset X$.

We suppose that the agent has enclosed the "worthwhile-to-move" inclusion $y \in W(x)$
within the inclusion $S(x) \supset W(x)$ in the following way. The agent can control the process
step by step, in such a way that, for all $x, y \in X$, $\theta(x,y) \geq \theta > 0$. Let

$S(x) = \{y \in X : g(y) - g(x) \geq \theta d(x,y)\}$.

With $x_0 \in X$ being given, the "worthwhile-to-move" process $x_n \to W(x_n)$, $n \in \mathbb{N}$, is
enclosed in the intermediate satisficing process $x_n \to S(x_n)$, $n \in \mathbb{N}$.

The relationship $S$ defined by $x S y \Leftrightarrow y \in S(x)$ is reflexive and transitive:

i) $x \in S(x)$ for all $x \in X$ is a consequence of $d(x,x) = 0$ for all $x \in X$.

ii) $y \in S(x)$ and $z \in S(y)$ implies $z \in S(x)$. This is a consequence of the triangle
inequality: $d(x,z) \leq d(x,y) + d(y,z)$.

Indeed, from $g(y) - g(x) \geq \theta d(x,y)$ and $g(z) - g(y) \geq \theta d(y,z)$, by adding the two
inequalities, $g(z) - g(x) \geq \theta (d(x,y) + d(y,z)) \geq \theta d(x,z)$.
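The role of the triangle inequality in this argument can be checked by brute force on a small finite example. The sketch below is only an illustration: the point sets, utilities, and the squared "distance" used as a counterexample are hypothetical and chosen with integer values so that all comparisons are exact.

```python
from itertools import product


def in_S(x, y, g, d, theta):
    """True when y belongs to S(x) = {y : g(y) - g(x) >= theta * d(x, y)}."""
    return g(y) - g(x) >= theta * d(x, y)


def is_preorder(points, g, d, theta):
    """Check reflexivity and transitivity of the relation 'y in S(x)'."""
    pts = list(points)
    reflexive = all(in_S(x, x, g, d, theta) for x in pts)
    transitive = all(
        in_S(x, z, g, d, theta)
        for x, y, z in product(pts, repeat=3)
        if in_S(x, y, g, d, theta) and in_S(y, z, g, d, theta)
    )
    return reflexive and transitive


# With a genuine distance, the relation is reflexive and transitive.
metric_ok = is_preorder(
    range(-5, 6), g=lambda x: -x * x, d=lambda x, y: abs(x - y), theta=2
)

# With a squared gap, which violates the triangle inequality,
# transitivity can fail (for instance along 0 -> 2 -> 4).
non_metric_ok = is_preorder(
    range(5), g=lambda x: 2 * x, d=lambda x, y: (x - y) ** 2, theta=1
)

print(metric_ok, non_metric_ok)
```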

Hence the enclosing process is nested: $S(x_0) \supset S(x_1) \supset \dots \supset S(x_n) \supset S(x_{n+1}) \supset \dots$
This allows the agent to gradually improve the localization of the "worthwhile-to-move"
process: Let $\{x_{n+1} \in W(x_n),\ n \in \mathbb{N}\}$ with $x_0 \in X$ be a "worthwhile-to-move" process which
can be enclosed, using the enclosing heuristic

$W(x) \subset S(x) = \{y \in X : g(y) - g(x) \geq \theta d(x,y)\}$ for all $x \in X$.

Then the "worthwhile-to-move" process is nested.

The agent has limited needs: his utility function $g : x \in X \mapsto g(x) \in \mathbb{R}$ is upper
bounded; set $\bar g = \sup_{x \in X} g(x) < +\infty$. Assume that the costs to move are high enough
locally: $C(x,y) \geq \theta d(x,y)$, $\theta > 0$. Then, during the enclosing process $x_n \to S(x_n)$, $n \in \mathbb{N}$,
with $x_0 \in X$ given, we have the succession of inequalities

$g(x_{n+1}) - g(x_n) \geq \theta d(x_n, x_{n+1})$, $n \in \mathbb{N}$. (9)

Let us state the simple but fundamental local action property:

The Local Action Proposition: If the agent has limited needs, high local costs to
move, and if he accepts limited intermediate sacrifices, by enclosing, this implies local actions
and convergent goals:

$d(x_n, x_{n+1}) \to 0$ as $n \to +\infty$ and $g(x_n) \to g^* \leq \bar g < +\infty$.

Proof: By inequality (9), the sequence of goals $\{g(x_n),\ n \in \mathbb{N}\}$ is increasing (by definition
of a distance, $d(x_n, x_{n+1})$ is nonnegative). As needs are limited ($\bar g < +\infty$), this sequence
converges to some limit $g^* \leq \bar g < +\infty$. Since $0 \leq \theta d(x_n, x_{n+1}) \leq g(x_{n+1}) - g(x_n) \to 0$,
it follows from (9) that the distance between two
successive states tends to zero, $d(x_n, x_{n+1}) \to 0$ as $n \to +\infty$.

Assuming that the state space is complete, we obtain the convergence property:

The Convergence Proposition: If the state space $X$ is a complete metric space, if
the per unit of time utility function $g(.)$ is upper bounded, if the worthwhile-to-move process
$x_n \to W(x_n)$, $n \in \mathbb{N}$, is enclosed within the enclosing process $x_n \to S(x_n)$, $n \in \mathbb{N}$, with
$x_0 \in X$ given and high enough local costs to move, then the "worthwhile-to-move" process
converges toward some final state, $x_n \to x^*$ as $n \to +\infty$.

Proof: By adding all the inequalities (9) from $n = 0$ to $m$, using the subadditivity of
the distance function, and the fact that the instantaneous utility function is upper-bounded
($\bar g < +\infty$), one obtains

$+\infty > \bar g - g(x_0) \geq g(x_{m+1}) - g(x_0) \geq \theta \sum_{n=0}^{m} d(x_n, x_{n+1})$. (10)

As a consequence, the series $\sum_{n=0}^{+\infty} d(x_n, x_{n+1}) < +\infty$ is convergent. The "worthwhile-to-move"
sequence $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence which converges to some state $x^* \in X$ in
the complete metric space $X$.
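The step from a convergent series of step lengths to a Cauchy sequence can be spelled out as follows. This is a short worked derivation that uses only inequality (9), the triangle inequality, and the limit $g^*$ of the goals; the indices $p > q$ are introduced here for illustration.

```latex
% For p > q, apply the triangle inequality along the path x_q, ..., x_p,
% then sum inequality (9) from n = q to p - 1 (the goal differences telescope).
\begin{align*}
d(x_q, x_p)
  &\le \sum_{n=q}^{p-1} d(x_n, x_{n+1}) \\
  &\le \frac{1}{\theta} \sum_{n=q}^{p-1} \bigl( g(x_{n+1}) - g(x_n) \bigr)
   = \frac{g(x_p) - g(x_q)}{\theta} \\
  &\le \frac{g^* - g(x_q)}{\theta} \longrightarrow 0
   \quad \text{as } q \to +\infty .
\end{align*}
```

The last bound does not depend on $p$, which is exactly the Cauchy property; it also shows that the remaining distance to travel is controlled by the remaining improvement $g^* - g(x_q)$ divided by $\theta$.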

2.3 From Rest Points to Behavioral Rest Points 

A performance $x^* \in X$ is said to be a rest point if, for any $y \in X$ with $y \neq x^*$, it is not
worthwhile to move from $x^*$ to $y$. This is equivalent to saying that $x^* \in X$ is a rest point of
the "worthwhile-to-move" relationship $x \in X \mapsto W(x) \subset X$, $W(x^*) = \{x^*\}$. In this case,
the agent has no further incentive to move. Thus, $x^* \in X$ is a rest point iff for any $y \neq x^*$,
$A(x^*, y) < \xi(x^*) C(x^*, y)$. In our simplified example, intermediate advantages to move are
proportional to per unit of time advantages to move, $A(x,y) = T (g(y) - g(x)) = g(y) - g(x)$.
If the exploitation time is given, $T = 1$, and costs to move are proportional to the distance
of moving, $C(x,y) = \theta d(x,y) \geq 0$, $\theta > 0$, then $x^* \in X$ is a rest point if, for any $y \neq x^*$,

$g(y) - g(x^*) < \theta d(x^*, y)$.

This property defines a rest point in a strong sense.

A weak rest point $x^* \in X$ is such that $g(y) - g(x^*) \leq \theta d(x^*, y)$, for all $y \in X$, with
$\theta \geq 0$. If $\theta = 0$, a weak rest point is a global maximum, while a strong rest point is the
unique global maximum of the function $g(.)$. The traditional optimization problem to solve
$\sup_{x \in X} g(x)$ is a particular case.
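On a finite set of states, the strong and weak rest point conditions can be tested directly. The following sketch is an illustration under assumed data: the integer grid, the quadratic utility, and the values of `theta` are hypothetical and are not taken from the text.

```python
def rest_point_type(x_star, states, g, d, theta):
    """Classify x_star over a finite set of states.

    'strong': g(y) - g(x_star) <  theta * d(x_star, y) for all y != x_star
    'weak'  : g(y) - g(x_star) <= theta * d(x_star, y) for all y
    'none'  : some move away from x_star is strictly worthwhile
    """
    others = [y for y in states if y != x_star]
    if all(g(y) - g(x_star) < theta * d(x_star, y) for y in others):
        return "strong"
    if all(g(y) - g(x_star) <= theta * d(x_star, y) for y in others):
        return "weak"
    return "none"


# Hypothetical example on an integer grid, utility peaked at 3.
states = list(range(7))


def g(y):
    return -(y - 3) ** 2


def d(x, y):
    return abs(x - y)


print(rest_point_type(2, states, g, d, theta=2))  # high friction
print(rest_point_type(2, states, g, d, theta=1))  # gain exactly offsets cost
print(rest_point_type(1, states, g, d, theta=2))  # moving is worthwhile
print(rest_point_type(3, states, g, d, theta=0))  # no friction
```

With `theta = 0` only the unique maximizer passes the strong test, which matches the remark that traditional optimization is the frictionless particular case.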

A performance $x^* \in X$ is said to be a "behavioral rest point" with respect to a given
initial data $x_0 \in X$ if it is a rest point and, starting from the given state $x_0 \in X$, it can
be reached, following a "worthwhile-to-move" (acceptable with not too many intermediate
sacrifices) transition process $x_n \to W(x_n)$, $n \in \mathbb{N}$.

Substantive rationality considers only the case of a rest point $x^* \in X$, more precisely a
global or a local maximum. If the agent happens to be there, he prefers not to deviate.

The Ky Fan theorem (Fan, 1972) gives conditions over the net incremental gain "to move
instead of to stay" $\Lambda(x,y) = A(x,y) - \xi(x) C(x,y)$ which guarantee the existence of a rest
point. The Ky Fan conditions are (Singh et al., 1997: 137): Let $X \subset T$ be a nonempty
convex set in a topological vector space $T$. Let $\Lambda : (x,y) \in X \times X \mapsto \Lambda(x,y) \in \mathbb{R}$ be such
that:

1. for each $x \in X$, $\Lambda(x,y)$ is a quasi-concave function of $y \in X$,

2. for each $y \in X$, $\Lambda(x,y)$ is a lower semicontinuous function of $x \in X$,

3. $\Lambda(x,x) \leq 0$ for all $x \in X$,

4. $X$ is compact.

Then there exists a point $x^* \in X$ such that $\Lambda(x^*, y) \leq 0$ for all $y \in X$.

But this story does not tell us why the agent is there, or how he has reached such a rest
point. Procedural rationality considers the much more convincing case of a behavioral rest
point. The story tells us where the agent starts from, and, step by step, which acceptable
paths he is supposed to follow to reach or to be locked in a behavioral rest point. Procedural
rationality examines which kinds of process converge to a rest point starting from any initial
position $x_0 \in X$, whether rest points exist, and whether such a process converges in finite time. In the
next sections, we show that, under some mild conditions, the "worthwhile-to-move" process
$x \in X \mapsto W(x) = \{y \in X,\ \Lambda(x,y) \geq 0\} \subset X$, where $\Lambda$ is the net incremental gain, converges to a rest point. Convergence in
finite time requires more.

Concrete examples of behavioral rest points are daily life behaviors such as routines,
habits, rules, norms, practices, lock-in effects. A behavioral rest point can be inefficient
because it is constrained by the transition path which ends at it. How far from the supremum
does an improving process lead? The inefficiency gap of such a behavior will be measured
by the difference $\bar g - g^* \geq 0$ between the supremum $\bar g < +\infty$ of $g(.)$ over the state space $X$
and the limit $g^* = \lim_{n \to +\infty} g(x_n) \leq \bar g$ of the improving process.
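To give an order of magnitude for this gap, consider a hypothetical example that is not in the text: $X = \mathbb{R}$ with $d(x,y) = |y - x|$, a quadratic utility $g(y) = -(y - a)^2$ (so that $\bar g = 0$ is reached at $a$), exploitation time $T = 1$, and costs to move $C(x,y) = \theta |y - x|$. Under these assumptions, the strong rest point condition of Section 2.3 can be solved explicitly.

```latex
% Hypothetical example: X = \mathbb{R}, g(y) = -(y - a)^2, \bar g = 0,
% T = 1, C(x, y) = \theta |y - x|. Write u = x^* - a and h = y - x^*.
\begin{align*}
g(y) - g(x^*) &= -(u + h)^2 + u^2 = -2uh - h^2
  \le |h| \bigl( 2|u| - |h| \bigr), \\
g(y) - g(x^*) < \theta |h| \ \text{ for all } h \neq 0
  &\iff 2|u| \le \theta
  \iff |x^* - a| \le \theta / 2, \\
\bar g - g(x^*) &= u^2 \le \theta^2 / 4 .
\end{align*}
```

In this example every state within $\theta/2$ of the maximizer is a rest point, so the inefficiency gap of a behavioral rest point is at most $\theta^2/4$ and vanishes as friction disappears.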

3 The Inertia Context: "Intermediate Costs to Move" 

We have first examined the balance between behavioral advantages and costs to move, to 
explain how incremental behaviors (muddling through processes) emerge (converge, hence 
making small steps). Before examining more goal-oriented behaviors, we examine the two 
terms of the balance, behavioral advantages and costs to move (which are not the usual ben- 
efit and cost functions) including goal-setting, cognitive, psychological, and inertial aspects. 
In this section we first examine costs to move. 

3.1 Frictions and Inertia 

Management sciences consider distorted perception, dulled motivation, failed creative re- 
sponse, political deadlocks, and action disconnecting (Rumelt, 1990). The term inertia is 
used for each of these frictions. 

• Distorted perception comes from myopia (the inability to forecast the future with
clarity), and from denial (a defensive behavior, which is the rejection of information
which is contrary to what is desired or what is believed to be true). Denial may stem
from hubris or from fear. Information filtering rejects information which is unpopular,
unpleasant, or contrary to doctrine. Grooved thinking rejects information which
deviates too much from common wisdom.

• Dulled motivation describes the lack of sufficient motivation to change, because changing
means abandoning costly, specific sunk investments.

• Failed creative response concerns the difficulty to choose a direction because of the 
complexity of the choice, the speed of change, inhibition and non-reactive state of 
mind, or inadequate strategic vision. 

• Political deadlocks come from the three main sources of disagreement among agents: 
difference in personal interest, in beliefs, and in values. 

• Action disconnecting comes from lack of vision, lack of leadership, attachment to the status
quo, and embedding in routines where changing one part of the process requires changing
many other parts.

To capture most of these frictions we define three blocks:

• i) Motivation building, goal-setting, and exploitation (doing, know-how). 

• ii) Exploration (information seeking, concept acquisition). 

• iii) Moving (changing, learning). 

These three kinds of frictions introduce reference-dependent effects on goals and costs. 
The terminology of inertia is limited to the moving block, to costs to move more or less 
quickly. 

3.2 Costs to move and costs to stay 

Costs to move can be either physical or behavioral. Costs to stay and dynamical costs to
move can be classified into costs linked to a task (costs to start, to progress, to stop), costs
to switch between tasks, and costs of interactions between agents.

1. Costs to Improve the Way of Doing a Task (Intra Costs): 

• Motivation costs are costs to set a goal or strive for it, to set an aspiration (longing) 
level, a satisficing level, and to adjust the longing level. Excitation costs are often 
necessary to start some novel action. Emotions drive the goal setting process. 

• Physical and physiological dynamic costs can be fixed costs to start an action, such as 
training, preliminary, or warming-up costs to repeat a task, or costs to initiate a new 
way of doing a task. There are also costs to stop an action. Dynamic costs to change 
concern inertia and reactivity. They increase with the speed with which the agent 
wants to do something, or to do it more quickly. They represent dynamic production 
costs which are linked to the intensity of effort and the speed of doing. Reactivity is 
one of the most important features of modern organizations.

• Costs to stay, maintenance and operational costs are costs to repeat an action. They
are related to the difficulty of the way of doing a task, and also to boredom and
displeasure. Weariness appears with repetitions and successions of several actions,
without enough time to recover.

• Cognitive costs concern knowledge acquisition, such as exploration costs or costs to
acquire information and knowledge. They also cover deliberation costs: costs to choose,
to eliminate, and to renounce. Knowledge costs are costs to solve a problem. They depend
on the residual difficulty of solving this problem. Deliberation costs before action belong
to the category of exploration costs (Conlisk, 1996). Learning costs can be behavioral costs
to learn how to do a task differently and better; they are costs of "ways of doing", like
recipe costs, and costs to "learn how to learn". Costs of sampling and gathering information
by trial and error are prominent at the beginning of a learning process. The agent must
determine whether a choice is an error or not.

• Psychological costs include costs to direct attention, costs coming from emotions,
stress, and anxiety, costs of doubt, ambiguity and risk, costs to accept choosing early,
eliminating possible solutions, or renouncing some opportunity, and the fear of regret
when knowledge and information are insufficient.

• Intertemporal or transition costs are linked to the costs of not reaching a target
immediately. They represent the costs of intermediate sacrifices which the agent has to
accept before reaching a final goal. These costs include impatience, resilience, boredom
from repetition, stress, costs to change too often, costs to wait, to delay, and costs to
accept being mistaken.

Costs to move take very different shapes, depending on the nature of the task: consuming, 
working, coaching, managing. 

2. Costs to Switch between Tasks (Inter Costs): 

Costs to switch from one task to another concern costs of dissimilarity, and, as before, 
costs to start actions, and costs to stop (inhibition costs). Costs of doing new actions 
increase with the degree of dissimilarity with past actions. Ehrlich (1975) insists on the fact 
that similar actions following a given action are much easier to do than disconnected actions 
belonging to different fields of competence. Costs of doing several tasks at the same time 
and costs of specialization (costs of doing the same task again and again) are intermingled. 
Inhibition costs are the costs to forget and leave a given task to be able to start a new 
task. Inertia costs can be addiction costs to stop consuming drugs, and costs to escape from 
habits. The more often an agent changes, the easier he changes again, or the more tiresome 
it is. 

Adaptation and adjustment costs group together costs to switch from one task to another,
when an agent has to perform several tasks (extensive aspect), and costs to change the "way of
doing" a given task (intensive aspect) if he performs only one task. These costs to change the
"way of doing" a task are behavioral.

Switching costs (Klemperer, 1995) concern adaptation costs for consumer tasks. For a 
consumer, to consume a good is a task, a kind of production act, a consumption recipe. 
Switching costs are adaptation costs for a consumer who chooses to move from the con- 
sumption of one good to another. They concern costs to lose compatibility with his existing 
physical equipment, his current knowledge, relational costs to switch from a supplier to a 
new one, costs to lose discount coupons, costs to learn using new brands, psychological costs 
to try new products. 

3. Interaction and Network Social Costs: 

Costs to stay and costs to change also concern interaction costs between agents, notably
network costs to keep, build, or delete links. Transaction costs are costs of making an
economic exchange, such as the effort required to find the preferred variety of a product,
the travel time between home and store, and queuing time. Transaction costs include search
and information costs, bargaining costs, and policing, monitoring and enforcement costs.
Following Coase (1937) and then Williamson (1975), the determining factors of transaction
costs are frequency, specificity, uncertainty, bounded rationality, and opportunistic behavior.
Transaction costs, or frictions, are dynamic (Langlois, 2005). They concern coordination and
synchronization costs, costs to improve fits, cooperation costs, costs to exchange and build
common knowledge, and organizing costs.

3.3 Modelling behavioral costs to move and costs to stay 

For an agent having reached a given state $x \in X$, we model:

1. Costs to stay or maintenance costs (Operational costs):

If the agent stays at $x$, he must pay maintenance costs $m(x)$, or costs to stay at
$x$. Costs to stay at $x \in X$ include the physical costs to repeat $x$ and the
psychological costs of feeling a persistent frustration coming from the unsatisfied needs
$f(x) = \tilde g(x) - g(x) \geq 0$. We will embed maintenance costs in the instantaneous
utility gain, $g(x) = \varphi(x) - m(x)$, where $\varphi(x)$ is the gross utility.

2. Physical and Behavioral Costs to Move:

If the agent moves, he must pay physical, physiological, psychological, and cognitive
costs of moving from $x$ to some better state $y$. Let $C(x,y) \geq 0$ be the cost to move
from $x$ to $y$.

3. Cognitive Exploration Costs:

If the agent moves, he must spend exploration expenditures $K(x)$ to know if, after
exploration, it will be worthwhile to move from $x$ to some better state $y$. At each
step, and before moving, exploration concerns the estimation of the two terms of the
balance between advantages to move $A(x,y)$ and costs to move $C(x,y)$. Exploration
costs are static, moving costs are dynamic.

4. Opportunity Costs to Move:

Costs to move include opportunity costs of not exploiting current opportunities. If,
moving from $x$ to $y$, the agent fails to exploit his utility $g(x)$ during a time $t(x,y)$,
the corresponding opportunity costs are $O(x,y) = t(x,y)\,g(x)$.

Because of space limitations, and the huge variety of costs to change, we give a reduced
mathematical form for costs of moving. The set of alternatives $X$ is a complete metric space.
The distance $d(x,y)$ between two alternatives $x \in X$ and $y \in X$ is an index of
dissimilarity between them. The physical cost of moving from $x$ to $y$ is decomposed into

$$C(x,y) = e(x,y)\,d(x,y) = t(x,y)\,c(x,y).$$

Costs to move are zero if the agent does not move: $C(x,x) = 0$, or $t(x,x) = 0$, and
$c(x,x) = 0$. These formulas define the per unit of distance cost to move $e(x,y) \geq 0$ and
the per unit of time cost to move $c(x,y) \geq 0$. Efforts per unit of distance are not
symmetrical: $e(y,x) \neq e(x,y)$, so costs to move are not symmetrical either. We can have
$C(x,y) \neq C(y,x)$. This justifies the use of relative entropies such as Kullback-Leibler or
Bregman distances.

With regard to the dependence of the costs to move with respect to the distance, we 
distinguish two classes: 

• High local costs to move, which correspond to a minimum per unit of distance effort
of moving $e(x,y) \geq \underline{e} > 0$. An example is $C(x,y) = e\,d(x,y)$, whose
corresponding mechanical notion is dry friction. This situation is examined in section 8.

• Low local costs to move, which is the complementary class. This includes the important
case $C(x,y) = e\,d(x,y)^2$, whose corresponding mechanical notion is viscous friction.
Because of the presence of the square, small changes induce very small costs (such
as passing from 1/10 to 1/100). In that case, convergence is more difficult to prove;
it requires extra geometrical assumptions on the gain or utility function $g$ (such as
quasi-convexity or analyticity). A result illustrating this situation (local search and
proximal algorithm) is given in section 9.
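
The contrast between the two classes can be made concrete with a minimal numerical sketch. The function names and the unit effort `e = 1.0` are illustrative assumptions, not notation from the paper; the two formulas are the dry and viscous friction examples given above.

```python
def dry_friction_cost(distance: float, e: float = 1.0) -> float:
    """High local cost: C = e * d, so the cost per unit of distance stays at e."""
    return e * distance


def viscous_friction_cost(distance: float, e: float = 1.0) -> float:
    """Low local cost: C = e * d**2, so the cost per unit of distance is e * d."""
    return e * distance ** 2


for d in (1.0, 0.1, 0.01):
    dry = dry_friction_cost(d)
    viscous = viscous_friction_cost(d)
    # Per-unit-of-distance effort: constant for dry friction, vanishing with d for viscous.
    print(f"d={d:<5} dry C={dry:.4f} (C/d={dry / d:.2f})  "
          f"viscous C={viscous:.4f} (C/d={viscous / d:.2f})")
```

With viscous friction the effort per unit of distance vanishes for small steps, which is why a lower bound $\underline{e} > 0$ is not available in that class.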

The dependence of costs to move with respect to time is also a rich topic. The term
$t(x,y)$ is the time spent to move physically from $x$ to $y$. Examples are reactivity
costs to move $C(x,y) = t(x,y)\,c(x,y)$, where the instantaneous cost to move
$c(x,y) = \rho\left(x, d(x,y)/t(x,y)\right)$ depends on the mean speed of moving
$v(x,y) = d(x,y)/t(x,y)$ (Attouch and Soubeyran, 2006). This instantaneous cost increases
more or less with speed.

4 The Goal-Setting Context: Intermediate Advantages 
to Move 

Intermediate advantages to move include intermediate aspiration levels with psychological
aspects. For each state $(x, g(x)) \in X \times \mathbb{R}$, let $\tilde g(x) \geq g(x)$ be the
intermediate aspiration level of the agent, which is, for example, an estimation of the unknown
supremum $\bar g$, or any other level.

Reference-dependent advantages to move: Assume that, after exploration around the
current state $x$ and given his aspiration level $\tilde g(x)$, the agent can find an improving
state $y \in E(x, r(x))$, but fails to reach his aspiration level: $g(x) < g(\eta) < \tilde g(x)$
for all $\eta \in E(x, r(x))$. At $y \in X$, the agent will have two opposite feelings. The gap
$\tilde g(x) - g(x) > 0$ is a half full or half empty bottle:

• He first has a feeling of satisfaction $\omega(x,y) = \lambda(x)\,(g(y) - g(x))$ at having
improved, $g(y) - g(x) > 0$, where $\lambda(x) \geq 0$ is the weight the agent puts on
satisfaction;

• He also has a feeling of disappointment $f(x,y) = \mu(x)\,n(x,y)$ at having been unable to
satisfy his initial ambition, where $\mu(x) \geq 0$ is the weight the agent puts on
disappointment and $n(x,y) = \tilde g(x) - g(y) \geq 0$ is the remaining gap, so that his
advantage to move decreases with $n(x,y)$;

• The agent can put a weighted term $\nu\,g(y)$ on the improving utility $g(y)$, where
$\nu = \nu(x) > 0$.

Instantaneous weighted advantages to move are

$$
\begin{aligned}
a(x,y) &= \nu\,g(y) + \lambda\,(g(y) - g(x)) - \mu\,(\tilde g(x) - g(y)) \\
&= \nu\,g(x) + (\lambda + \nu)\,(g(y) - g(x)) - \mu\,\big((\tilde g(x) - g(x)) - (g(y) - g(x))\big) \\
&= (\lambda + \mu + \nu)\,(g(y) - g(x)) + \nu\,g(x) - \mu\,(\tilde g(x) - g(x)).
\end{aligned}
$$

At each step, the reference is the advantage or disadvantage to stay

$$a(x,x) = \nu\,g(x) - \mu\,(\tilde g(x) - g(x)),$$

which is the frustration to stay at $x \in X$. The agent considers the difference between his
advantages to move and his advantages to stay

$$a(x,y) - a(x,x) = (\lambda + \mu + \nu)\,(g(y) - g(x)). \tag{11}$$

This is a reference-dependent criterion for the advantage to move.

The agent may also consider the dynamic advantage to move, which is the advantage
to move multiplied by the length of time during which he hopes to benefit from it. Let
$t(y) \geq 0$ be the length of time the agent chooses, starting from $x$, to stay at $y$ and to
exploit the benefit of this new state $y$. The temporary "reference-dependent" advantage to
move rather than to stay is

$$
\begin{aligned}
A(x,y) &= t(y)\,\big(a(x,y) - a(x,x)\big) && (12) \\
&= t(y)\,\delta(x)\,(g(y) - g(x)) \geq 0, && (13)
\end{aligned}
$$

where

$$\delta(x) = \lambda(x) + \mu(x) + \nu(x) > 0 \tag{14}$$

is the character index of the agent. This formulation includes "improving", when the agent
is not much goal-oriented, with $\lambda(x) = \mu(x) = 0$, $\nu(x) > 0$, and "improving enough,"
when the agent is much more goal-oriented, $\lambda(x), \mu(x), \nu(x) > 0$.
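
As a quick check of Eqs. (11) to (13), the sketch below computes the advantage to move directly from its three components and compares it with the closed form $t(y)\,\delta(x)\,(g(y) - g(x))$. The function names and all numerical values are illustrative assumptions.

```python
def advantage(g_x, g_y, aspiration, lam, mu, nu):
    """Instantaneous weighted advantage a(x, y) of being at y when coming from x."""
    satisfaction = lam * (g_y - g_x)          # lambda * (g(y) - g(x))
    disappointment = mu * (aspiration - g_y)  # mu * (aspiration level - g(y))
    return nu * g_y + satisfaction - disappointment


def temporary_advantage(g_x, g_y, aspiration, lam, mu, nu, t_y):
    """A(x, y) = t(y) * (a(x, y) - a(x, x)), as in Eq. (12)."""
    a_move = advantage(g_x, g_y, aspiration, lam, mu, nu)
    a_stay = advantage(g_x, g_x, aspiration, lam, mu, nu)
    return t_y * (a_move - a_stay)


# Illustrative values only.
A = temporary_advantage(g_x=1.0, g_y=1.5, aspiration=2.0,
                        lam=0.5, mu=0.25, nu=1.0, t_y=4.0)
delta = 0.5 + 0.25 + 1.0                      # character index, Eq. (14)
print(A, 4.0 * delta * (1.5 - 1.0))           # both sides of Eq. (13)
```

The aspiration level cancels in the difference $a(x,y) - a(x,x)$, so only the character index $\delta(x)$ and the utility gain remain.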

5 Punctuated "Exploitation-Exploration" and Moving 
Processes 

We define the main ingredients of a generic exploitation-exploration phase $[x]$. This notation
includes the state $x$ and the control variables $h(x)$, $\alpha(x)$, $r(x)$, which are described
below.

Starting from $x$, the duration $h(x) > 0$ of an exploitation-exploration period is chosen
first. Then, the agent chooses how to share $h(x)$ between exploitation and exploration. He
spends $t(x) = \alpha(x)\,h(x)$ units of time exploiting his instantaneous utility $g(x)$ and
$\tau(x) = (1 - \alpha(x))\,h(x)$ units of time exploring around. Thus, $\alpha(x) \in (0,1)$
and $h(x) > 0$ are control variables.

During the phase of exploitation-exploration, the agent stays at $x \in X$. The longer the
agent exploits the utility $g(x)$, the longer he can explore around, for a given share of time
$1 - \alpha(x)$. If the agent, after exploration and moving, has reached a state $x$ which
improves on a previous state $x_0$, $g(x) > g(x_0)$, and chooses to stay at $x$ during
$t(x) > 0$, then the longer the exploitation time $t(x)$ he chooses, the larger his advantage
to move from $x_0$ to $x$, $A(x_0,x) = t(x)\,\delta(x_0)\,(g(x) - g(x_0))$. A fixed horizon
limits this possibility.

At each step, the agent has some exploration expenditures $K = K(x) = \tau(x)\,k(x) \geq 0$
to explore around the current state $x$. This amount of resource $K(x)$ helps him discover
local advantages $A(x,y)$ and costs to move $C(x,y)$ around the current state $x$. As before,
to simplify, we take the exploration expenditure per unit of time equal to one, $k(x) = 1$. The
size of the exploration set increases with exploration expenditures, $r(x) = r(x, K(x))$.

The necessity to repeat exploration at each step comes from the anchoring aspect of
advantages and costs to move. The agent discovers all the advantages $A(x,y)$ and costs to
move $C(x,y)$ from $x$ to $y \in E(x, r(x))$, and then moves from $x$ to $y$; he does not yet
know the advantages and costs to move $A(y,z)$ and $C(y,z)$ from $y$ to $z$. This also prevents
him from considering cumulative payoffs.

At the end of the exploitation-exploration period $[x]$ of duration $h(x)$, the agent
estimates the advantages and costs to move for all states within the local exploration set,
$y \in E(x, r(x)) \subset X$. He compares them, and chooses to stay at $x$ or to move from $x$
to some $y \in E(x, r(x))$. If, for some $y \in E(x, r(x))$, advantages to move are greater than
a given fraction of costs to move (including opportunity costs to move $O(x,y)$), say
$A(x,y) \geq \xi\,(C(x,y) + O(x,y))$, where $\xi > 0$ is a given non sacrificing ratio, he would
rather move than stay at $x$. Exploration around is a way for him to be rational. If he decides
to move from $x$ to $y$, the agent leaves the static temporary routine phase $[x]$ to enter a
phase of moving, denoted by $[x,y]$, which links the previous static exploitation-exploration
phase $[x]$ to a new exploitation-exploration phase $[y]$. A new static period $[y]$ of
exploitation-exploration follows a phase $[x,y]$ of moving.

Given the distance $d(x,y)$ between $x$ and $y$, the control variables of the moving phase are
its duration $t(x,y)$, the instantaneous cost of moving $c(x,y)$, and the per unit of distance
cost $e(x,y)$. Costs of moving

$$C(x,y) = t(x,y)\,c(x,y) = e(x,y)\,d(x,y) \tag{15}$$

and the modulus of the speed of moving

$$v(x,y) = d(x,y)/t(x,y) \tag{16}$$

link these choice variables:

$$c(x,y) = e(x,y)\,v(x,y). \tag{17}$$

The context may lead us to add opportunity costs of moving $O(x,y) = t(x,y)\,g(x)$ when
we consider a single agent who has a strict time constraint. These opportunity costs are
the lost gains, because the agent is compelled to stop exploitation during the moving phase
of duration $t(x,y)$. For a group, such opportunity costs do not exist because the group
can escape from time constraints and exploit the amount $t(x,y)\,g(x)$ during this period of
moving. The group can hire workers to move from $x$ to $y$. For a single agent the balance is

$$A(x,y) \geq \xi\,(C(x,y) + O(x,y)) \tag{18}$$

and $A(x,y) \geq \xi\,C(x,y)$ for a group.
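
The links between the choice variables of a moving phase and the two balances can be sketched as follows. The helper names `moving_phase` and `worthwhile` and the numbers are illustrative assumptions; the speed is taken as distance over duration, as in the reactivity-cost example of section 3.3.

```python
def moving_phase(distance, duration, effort_per_distance):
    """Return (C, v, c) for a moving phase [x, y]."""
    if duration <= 0:
        raise ValueError("duration t(x, y) must be positive for an actual move")
    cost = effort_per_distance * distance          # C = e * d
    speed = distance / duration                    # v = d / t
    cost_per_time = effort_per_distance * speed    # c = e * v, so that C = t * c
    return cost, speed, cost_per_time


def worthwhile(advantage, cost, xi, opportunity_cost=0.0):
    """Balance A >= xi * (C + O); a group sets opportunity_cost to 0."""
    return advantage >= xi * (cost + opportunity_cost)


# Illustrative numbers only (not taken from the paper).
g_x, duration = 2.0, 1.5                       # utility given up while moving, t(x, y)
C, v, c = moving_phase(distance=3.0, duration=duration, effort_per_distance=1.0)
O = duration * g_x                             # opportunity cost O(x, y) = t(x, y) * g(x)
group_moves = worthwhile(4.0, C, xi=0.8)
single_agent_moves = worthwhile(4.0, C, xi=0.8, opportunity_cost=O)
print(C, v, c, group_moves, single_agent_moves)
```

With these numbers the same move is acceptable for a group but not for a single agent, because the single agent also counts the utility lost while moving.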



6 Behavioral Dynamics: Not Too Many Intermediate 
Sacrifices 

The "worthwhile-to-move" "decision + action" process follows a punctuated dynamic, an
alternation of static periods of exploitation-exploration and dynamic periods of moving.
The agent faces contradictory choices, which lead him to use heuristics of choice before
taking action.

6.1 A Punctuated Dynamic 

Starting from $x$, the agent first chooses the duration $h(x)$ of the exploitation-exploration
period. Then, the agent makes a second choice: a fraction of time $\alpha(x) > 0$ is devoted
to exploitation, and the other fraction $1 - \alpha(x) \geq 0$ to exploration. He exploits
during the duration $t(x) = \alpha(x)\,h(x)$ and explores during the duration
$(1 - \alpha(x))\,h(x)$. Consider a punctuated dynamic, a "stop and go" sequence made of three
periods: a static exploitation-exploration period $[x]$, followed by a period $[x,y]$ of moving
and a new static exploitation-exploration period $[y]$, of respective lengths $h(x) > 0$,
$t(x,y) \geq 0$ and $h(y) > 0$. The decision to stay at $x$ and to lengthen the period $[x]$, or
to move from $x$ to $y$, entering the moving period $[x,y]$ to reach the new temporary routine
period $[y]$, is, by definition, taken at the end of period $[x]$. At the beginning of the
period $[x,y]$ of moving, in order to decide whether or not to move from $x$ to $y$, the agent
compares the estimated incremental advantages to move
$A(x,y) = t(y)\,\delta(x)\,(g(y) - g(x))$ to the estimated costs to move "without opportunity
costs" $C(x,y) = t(x,y)\,c(x,y)$, or costs to move "with opportunity costs" $C(x,y) + O(x,y)$.
Exploration costs $K(x)$ are not involved in the costs of moving $C(x,y)$, but in the larger
category of costs to change.

6.2 The "Worthwhile-to-Move" Inclusion

For the sake of simplicity, we consider the case of an agent with "no opportunity costs" to
move (the other case will be examined just after). For an agent following this punctuated
dynamic, it is worthwhile to move from $x$ to $y$ if estimated net advantages to move $A(x,y)$
are greater than estimated costs to move $C(x,y)$: $A(x,y) \geq C(x,y)$, or than some portion
$0 < \xi(x) \leq 1$ of them, $A(x,y) \geq \xi(x)\,C(x,y)$. This means that some temporary
sacrifices are allowed. In real life, at each step, the agent accepts some temporary sacrifices:
he does not include the fraction $(1 - \xi(x))\,C(x,y)$ of the costs to move $C(x,y)$ in the
balance. This helps him to hope to improve more easily from $g(x)$ to $g(y) > g(x)$. The
forgotten portion of costs to move, $(1 - \xi(x))\,C(x,y)$, is traded against an expected increase
of the speed of improvement, because less exploration around is required to estimate and
satisfy the worthwhile-to-move constraint. The ratio $1 - \xi(x) \geq 0$ is the chosen sacrificing
rate, while its complement $\xi(x)$ is the non sacrificing rate at $x \in X$.

Define the "worthwhile-to-move" relationship without opportunity costs

$$x \in X \longmapsto W(x) = \{y \in X : A(x,y) \geq \xi(x)\,C(x,y)\} \tag{19}$$

and the explored portion of it, $W^E(x) = W(x) \cap E(x, r(x))$. This relationship is clarified by
considering both the respective lengths of the periods $t(x,y)$ and $t(y)$, and the advantages
and costs to move:

$$
\begin{aligned}
& A(x,y) \geq \xi(x)\,C(x,y) && (20) \\
\Longleftrightarrow\ & t(y)\,\delta(x)\,(g(y) - g(x)) \geq \xi(x)\,t(x,y)\,c(x,y) = \xi(x)\,e(x,y)\,d(x,y) && (21) \\
\Longleftrightarrow\ & g(y) - g(x) \geq \theta(x,y)\,d(x,y) && (22)
\end{aligned}
$$

where

$$\theta(x,y) = \frac{\xi(x)\,e(x,y)}{t(y)\,\delta(x)} \tag{23}$$

if $t(y) > 0$ and $\delta(x) > 0$. The worthwhile-to-move relationship becomes

$$W(x) = \{y \in X : g(y) - g(x) \geq \theta(x,y)\,d(x,y)\} \subset X. \tag{24}$$

A worthwhile-to-move step is such that the ratio between the advantages to move and
the distance to move, $(g(y) - g(x))/d(x,y)$, is higher than the acceptable transition rate
$\theta(x,y) \geq 0$.

Consider the case where opportunity costs $O(x,y) = t(x,y)\,g(x)$ are included in the
comparison between advantages and costs to move. The worthwhile-to-move relationship
becomes $A(x,y) \geq \xi(x)\,C(x,y) + \xi(x)\,O(x,y)$. Introducing the speed of moving
$v(x,y) = d(x,y)/t(x,y)$ brings us back to the previous situation. In this case, the time spent
to move is $t(x,y) = d(x,y)/v(x,y)$ and opportunity costs to move are
$O(x,y) = g(x)\,d(x,y)/v(x,y)$. The acceptable transition ratio becomes
$\theta(x,y) = \xi(x)\,[e(x,y) + g(x)/v(x,y)]\,/\,[t(y)\,\delta(x)]$.
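
A minimal sketch of the acceptable transition ratio, with and without opportunity costs, is given below. The function names, the optional `speed` argument, and the numbers are illustrative assumptions; the formula is the one stated in the paragraph above.

```python
def transition_ratio(xi, e, t_y, delta, g_x=0.0, speed=None):
    """theta = xi * (e + g(x)/v) / (t(y) * delta); omit speed to ignore opportunity costs."""
    if t_y <= 0 or delta <= 0:
        raise ValueError("t(y) and delta(x) must be strictly positive")
    effort = e if speed is None else e + g_x / speed
    return xi * effort / (t_y * delta)


def is_worthwhile_step(g_x, g_y, distance, theta):
    """Reduced worthwhile-to-move test: g(y) - g(x) >= theta * d(x, y)."""
    return g_y - g_x >= theta * distance


# Illustrative numbers only.
theta_group = transition_ratio(xi=0.5, e=2.0, t_y=4.0, delta=1.0)
theta_single = transition_ratio(xi=0.5, e=2.0, t_y=4.0, delta=1.0, g_x=1.0, speed=0.5)
print(theta_group, theta_single)
print(is_worthwhile_step(1.0, 1.3, 1.0, theta_group),
      is_worthwhile_step(1.0, 1.3, 1.0, theta_single))
```

Opportunity costs raise the ratio, so a step that passes the test without them may fail once they are counted. A longer exploitation time $t(y)$ or a higher character index $\delta(x)$ lowers it.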



6.3 Topology and Tychastics. 

The main difficulty of decision-making in a complex environment is to manage the transition
toward a final goal, balancing between intermediate state-dependent advantages and costs
to move. During the transition, when costs to move are important to consider, we acknowledge
our ignorance of what the agent is exactly doing; this is a particular case of the
"satisficing-by-rejection principle" of Martinez-Legaz et al. (2002), and Soubeyran (2006).

The agent rejects the moves that require too many intermediate sacrifices, those which are
not "worthwhile-to-move" because advantages to move are lower than a certain proportion of
costs to move. The agent rejects alternatives which are perceived as too sacrificing.
Inequalities or set inclusions are convenient for dealing with uncertainty when no probability
distribution is known (Aubin, 2005).

At each step, the first problem is ignorance. Exploration concerns knowledge acquisition
around a given state (structured information), instead of information acquisition.
Exploration helps the agent to discover his taste locally, with reference to the current state,
to better know his feelings about advantages and costs to move, to estimate his resilience to
effort, and to build his current ambition. It raises the question of how to choose the radius
of the exploration set $r(x) > 0$, and of how much to spend on exploring around the current
state. Is it worth exploring to know if it is worthwhile to move? When, after having explored
around a given state, is it worthwhile to move? Deciding ex ante how much to explore
starts an infinite regress which is difficult to cut. Heuristics of exploration can help do
that approximately. To escape from this regress problem, we assume a constant size
of the exploration set, $r(x) = r > 0$. If the agent follows a "worthwhile-to-move" process,
he will finally enter a clairvoyance ball and optimize. We also assume that the agent
explores more or less, depending on his motivation to change.

Based on a "worthwhile-to-move" behavior as the reference, we suggest a classification 
of behaviors according to the following four categories (they all are special cases of the last 
category): 

a) Low goal-oriented behaviors including

• Improving: $g(y) \geq g(x)$;

• "Worthwhile-to-move", "not too many sacrificing" behaviors: $A(x,y) \geq \xi(x)\,C(x,y)$;

b) High goal-oriented behaviors including

• "Improving enough": $g(y) \geq g(x) + \varepsilon(x)$, $\varepsilon(x) \geq 0$, which is
equivalent to "intermediate satisficing": $g(y) \geq \hat g(x) = g(x) + \varepsilon(x)$;

• "Intermediate satisficing with not too much sacrificing": $A(x,y) \geq \xi(x)\,C(x,y)$ and
$g(y) \geq \hat g(x)$.

The "not too many intermediate sacrificing" constraint $A(x,y) \geq \xi(x)\,C(x,y)$ reflects
the presence of costs to move.

In all cases, because of some ignorance, there is an adjoint state-dependent exploration
process $E(\cdot) : x \in X \longmapsto E(x, r(x))$ devised to discover the state-dependent
advantages and costs to move. At each step, before moving, the agent explores a portion of the
worthwhile-to-move set, $W^E(x)$.

The model is made of three blocks:

i) Goal setting block: $y \in I[x, \varepsilon(x)] = \{y \in X : g(y) \geq g(x) + \varepsilon(x)\}$;

ii) Moving block during the transition: $y \in W(x)$;

iii) Exploration block: $y \in E(x, r(x))$.

At each static period we introduce intermediate payoffs. The choice of the length of each
exploitation phase partially determines the size of each intermediate payoff as an endogenous
intermediate goal. During static periods the agent exploits the benefits or the net utility
$g(x)$ coming from the repetition of the present alternative $x$, then explores around for an
alternative $y \in E(x, r(x))$ which improves enough. During dynamic periods the agent only
moves.
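
The interplay of the three blocks can be illustrated by a small simulation on the real line. Everything specific here is an assumption made for illustration: the quadratic utility, a constant transition ratio `theta`, a constant exploration radius, probing on a finite grid of the exploration ball, and the rule of moving to the best acceptable candidate.

```python
def worthwhile_to_move_path(g, x0, theta, radius, epsilon=0.0, n_probe=200, max_steps=1000):
    """Follow a worthwhile-to-move process on the real line.

    At each step the agent probes n_probe + 1 points of [x - radius, x + radius]
    (exploration block), keeps those improving by at least epsilon (goal-setting
    block) and satisfying g(y) - g(x) >= theta * |y - x| (moving block), then moves
    to the best of them. It stops at a rest point when no candidate is left.
    """
    path = [x0]
    x = x0
    for _ in range(max_steps):
        candidates = []
        for k in range(n_probe + 1):
            y = x - radius + 2.0 * radius * k / n_probe
            gain = g(y) - g(x)
            if y != x and gain >= epsilon and gain >= theta * abs(y - x):
                candidates.append(y)
        if not candidates:
            break
        x = max(candidates, key=g)
        path.append(x)
    return path


def g(x):
    """Hypothetical utility, bounded above by 1 and maximal at x = 1."""
    return 1.0 - (x - 1.0) ** 2


path = worthwhile_to_move_path(g, x0=0.0, theta=0.5, radius=0.25)
print(path, g(path[-1]))
```

In this example the agent stops before the maximizer $x = 1$: close to it, the remaining utility gains no longer cover the fraction of the costs to move that he takes into account. The difference between the supremum and the utility at the rest point is the inefficiency gap discussed in section 2.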

7 Incrementalism through Enclosing

7.1 Using an Enclosing Heuristic to Enclose the "Worthwhile to 
Move" Inclusion 

We show how the agent manages the "worthwhile-to-move" punctuated transition, using an
enclosing heuristic. At each step, the agent sets upper and lower bounds on several control
variables. He encloses these variables within intervals.

The maximum duration of exploitation is $0 < t(y) \leq \bar t$, the minimum per unit of distance
effort of moving is $e(x,y) \geq \underline{e} > 0$, and the minimum rate of non sacrificing is
$\xi(x) \geq \underline{\xi} > 0$. Maximum weights over satisfaction and disappointment are
$\delta(x) \leq \bar\delta$. The acceptable transition ratio $\theta(x,y) \geq 0$ (defined in
Eq. (23), section 6) is higher than a satisficing transition ratio
$\theta = \underline{\xi}\,\underline{e}/(\bar t\,\bar\delta) > 0$.

In the presence of opportunity costs to move, if the speed of moving is not too high,
$0 < v(x,y) \leq \bar v$, the satisficing transition ratio is
$\theta(x) = \underline{\xi}\,(\underline{e} + g(x)/\bar v)/(\bar t\,\bar\delta) \geq \theta = \underline{\xi}\,(\underline{e} + g(x_0)/\bar v)/(\bar t\,\bar\delta)$,
because along an improving trajectory $g(x) \geq g(x_0)$. In both cases, following
a worthwhile-to-move process, for any consecutive steps $x$ and $y$, the following inequality

$$g(y) - g(x) \geq \theta\,d(x,y)$$

holds. Let $S(x) = \{y \in X : g(y) - g(x) \geq \theta\,d(x,y)\} \supset W(x)$ be the enclosing
worthwhile-to-move set. The enclosing heuristic is powerful because it defines a pre-order on
the state space $X$: the enclosing worthwhile-to-move relationship
$x \in X \longmapsto S(x) \subset X$ is reflexive, $x \in S(x)$ for all $x \in X$, and transitive.

If the distance effort of moving $e(x,y)$ is a strictly increasing function of the speed of
moving, $e(x,y) = \eta(v(x,y))$, and if the speed of moving is higher than a strictly positive
minimum level, $v(x,y) \geq \underline{v} > 0$, the effort per unit of distance $e(x,y)$ will be
higher than a strictly positive level, $e(x,y) \geq \underline{e} = \eta(\underline{v}) > 0$ for
all $x, y \in X$.
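
The pre-order property of the enclosing relationship is stated without proof. A short verification, sketched here under the assumption that $d$ is a metric (symmetric, with the triangle inequality) and $\theta \geq 0$ is constant, runs as follows:

```latex
\begin{aligned}
\text{Reflexivity:}\quad
  & g(x) - g(x) = 0 \geq \theta\, d(x,x) = 0, \ \text{so } x \in S(x). \\
\text{Transitivity:}\quad
  & \text{if } y \in S(x) \text{ and } z \in S(y), \text{ then} \\
  & g(z) - g(x) = \big(g(z) - g(y)\big) + \big(g(y) - g(x)\big) \\
  & \qquad \geq \theta\,\big(d(y,z) + d(x,y)\big) \geq \theta\, d(x,z),
    \ \text{so } z \in S(x).
\end{aligned}
```

The state-dependent ratio $\theta(x,y)$ of the original inclusion $W(x)$ does not allow this argument, which is one reason for enclosing $W(x)$ in $S(x)$.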



7.2 More on Incrementalism: Convergence in Finite Time 

Each period consists of a static period $[x]$ of exploration-exploitation of duration $h(x)$,
where $x = x_n$, and of a dynamic moving period $[x,y]$ from $x$ to a new improving state
$y = x_{n+1}$, of duration $t(x,y)$. A new period of exploration-exploitation $[y]$ then starts.
At each exploration-exploitation period, the time spent for exploitation is
$t(x) = \alpha(x)\,h(x)$, and the time spent for exploration is $(1 - \alpha(x))\,h(x)$, with
$0 < \alpha(x) \leq 1$.

The total time spent for exploration-exploitation and moving along a worthwhile-to-move
and satisficing trajectory is

$$T = \sum_{n=0}^{+\infty} \big(h(x_n) + t(x_n, x_{n+1})\big). \tag{25}$$

If the speed of moving is lower bounded, $v(x,y) \geq \underline{v} > 0$, the relationship between
speed and distance implies that $d(x,y) = t(x,y)\,v(x,y) \geq t(x,y)\,\underline{v}$. Subsequently,

$$\sum_{n=0}^{+\infty} t(x_n, x_{n+1}) \leq \frac{1}{\underline{v}} \sum_{n=0}^{+\infty} d(x_n, x_{n+1}) < +\infty, \tag{26}$$

this last inequality being a consequence of the "worthwhile-to-move" principle (see
subsection 2.2).
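
Subsection 2.2 is not reproduced here. As a sketch of why the path has finite length, assume the enclosing inequality $g(x_{n+1}) - g(x_n) \geq \theta\,d(x_n, x_{n+1})$ of section 7.1 with a constant $\theta > 0$, and a utility bounded above by $\bar g < +\infty$. Summing and telescoping gives:

```latex
\begin{aligned}
\theta \sum_{n=0}^{N} d(x_n, x_{n+1})
  &\leq \sum_{n=0}^{N} \big(g(x_{n+1}) - g(x_n)\big)
   = g(x_{N+1}) - g(x_0)
   \leq \bar g - g(x_0), \\
\text{hence}\quad
\sum_{n=0}^{+\infty} d(x_n, x_{n+1})
  &\leq \frac{\bar g - g(x_0)}{\theta} < +\infty .
\end{aligned}
```

The bound depends only on the initial gap $\bar g - g(x_0)$ and on the satisficing transition ratio $\theta$.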

If the agent exploits a fraction of time $\alpha(x) \geq \underline{\alpha} > 0$ greater than or
equal to a given minimum level $\underline{\alpha} > 0$, and if the total time spent for
exploitation is finite, $\sum_{n=0}^{+\infty} t(x_n) < +\infty$, then:

$$\sum_{n=0}^{+\infty} h(x_n) = \sum_{n=0}^{+\infty} \frac{t(x_n)}{\alpha(x_n)} \leq \frac{1}{\underline{\alpha}} \sum_{n=0}^{+\infty} t(x_n) < +\infty. \tag{27}$$

Adding the two inequalities (26) and (27) and using Eq. (25) gives the convergence in finite
time.

8 Instrumentalism: Shrinking of Worthwhile-to-Move
Behaviors which "Improve Enough"

Consider a more goal-oriented "worthwhile-to-move" process where, in an inertia context,
the agent wants both to follow a "worthwhile-to-move" transition

$$A(x,y) \geq \xi(x)\,C(x,y),$$

where the agent must compensate intermediate costs to move by intermediate advantages
to move, in such a way that advantages to move are higher than a given fraction of costs to
move; and to "improve enough":

$$g(y) - g(x) \geq \varepsilon(x).$$

We shall show that, if

• the instantaneous utility function is upper bounded,

• the agent is able to enclose the "worthwhile-to-move" inclusion $y \in W(x)$ in an
enclosing inclusion $y \in S(x)$,

then such a "worthwhile-to-move" and "temporary satisficing" process shrinks. The radius
$\rho(S(x_n)) = \sup_{y \in S(x_n)} d(x_n, y)$ of the enclosing inclusion converges to zero. Hence
the enclosing inclusion and the "worthwhile-to-move" inclusion shrink.

8.1 "Improving Enough" Worthwhile-to-Move Behaviors 

Consider first a goal-oriented setting process which is not "worthwhile-to-move". This
defines an intermediate satisficing process where the agent wants to "improve enough"
(Soubeyran, 2006). At each step, the agent at $x$ sets a new aspiration level
$\tilde g(x) \geq g(x)$, then sets an adjoint satisficing level $\hat g(x)$,
$g(x) \leq \hat g(x) \leq \tilde g(x)$, and tries to reach it. He must explore in an exploration
set $E(x, r(x)) \subset X$ around $x$ to find some $y$ such that $g(y) \geq \hat g(x)$. The agent
adapts his aspiration level. Let $\bar g = \sup\{g(y) : y \in X\}$ be the finite supremum of the
upper bounded utility function $g(\cdot)$. The agent sets a feasible aspiration level,
$\tilde g(x) \leq \bar g < +\infty$. Otherwise, he does not reach it and, sooner or later, will
have either to further explore around the current state or to relax his aspiration level.

If the agent follows both a "worthwhile-to-move" process $y \in W(x)$ and an intermediate
satisficing process, and if he has succeeded in enclosing it in the enclosing process
$y \in S(x)$, then, by exploration around the current state, the agent can discover locally his
utility function as well as the enclosing inclusion

$$S(x) = \{y \in X : g(y) - g(x) \geq \theta\,d(x,y)\}.$$

Let 

s{x) = sup {g{y) : y G S{x)} <g<+oo (28) 

be the highest unknown aspiration level that the agent can reach in the enclosing inclusion. 
Let 

g{x) = g{x) -\-p{x) {s{x) - g{x)) ; with p{x) > (29) 

be the unknown relationship between the aspiration level g{x) and s{x). If starting from x, 
and exploring around, the agent can find a "worthwhile-to-move" state y G S{x) C W{x) 
which improves enough, or 

9iy) - 9{x) > q{x) {g{x) - g{x)) = e{x) (30) 
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where the rate of need reduction q{x) £ ]0, 1[ is greater than or equal to a minimum level, 
< q < q{x) < 1 for all x £ X, then, from g{x) ~g{x) = p{x) {s{x) — g{x)) , the intermediate 
satisficing level is 

g{x) = g{x) + q{x) {g{x) - g{x)) (31) 
= gix) + p{x)q{x) {s{x) ^ g{x)) (32) 
= gix) + a{x) {s{x) - gix)) (33) 

where < a < (t{x) = p{x)q(x) < 1. The "improving enough" condition is defined by 

9{y) ~ 9{x) > p{x)q{x) {s{x) - g{x)) = a{x) {s{x) - g{x)) . (34) 

Take $0 < \underline\sigma \le \sigma(x) = p(x)q(x) < 1$, which is always possible, although the agent must
explore enough to manage to do that. This requires setting a high enough aspiration level
$\bar g(x)$ with respect to its highest feasible level $s(x)$ (the rate $p(x)$ must be high enough,
$0 < \underline p \le p(x)$), and trying to fill a "large enough" fraction $1 > q(x) \ge \underline q > 0$ of the aspiration
gap $\bar g(x) - g(x) > 0$, such that $0 < \underline\sigma \le \sigma(x) = p(x)q(x) < 1$, where $0 < \underline\sigma = \underline p\,\underline q < 1$. This
means that if the agent chooses a small rate $q(x)$, he can choose a large rate $p(x)$. This is
the case if he has chosen a high enough aspiration level $\bar g(x)$.

Starting from any $x$, an "improving enough" and "worthwhile-to-move" state $y \in S(x) \subset
W(x)$, such that Eq. (34) is satisfied with $0 < \underline\sigma \le \sigma(x) < 1$, is always possible because $s(x)$
is a supremum.
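
The relations in Eqs. (29) to (33) can be checked numerically. The following is only a minimal sketch: the function name `aspiration_levels` and the sample numbers are illustrative assumptions, not part of the model, and in the model the supremum s(x) is unknown to the agent, whereas here it is simply given.

```python
def aspiration_levels(g_x, s_x, p, q):
    """Return (aspiration, satisficing, sigma) for a current utility g_x.

    g_x : current utility g(x)
    s_x : supremum s(x) of g over the enclosing set S(x); s_x >= g_x since x is in S(x)
    p   : aspiration rate p(x) > 0
    q   : rate of need reduction q(x), with 0 < q < 1
    (All names are hypothetical; they only mirror the symbols of the text.)
    """
    if s_x < g_x:
        raise ValueError("s(x) must be at least g(x)")
    if p <= 0 or not (0 < q < 1):
        raise ValueError("need p > 0 and 0 < q < 1")
    aspiration = g_x + p * (s_x - g_x)          # Eq. (29)
    satisficing = g_x + q * (aspiration - g_x)  # Eq. (31)
    sigma = p * q                               # rate appearing in Eqs. (32)-(33)
    return aspiration, satisficing, sigma


# Illustrative numbers (assumptions): g(x) = 2, s(x) = 10, p = 0.75, q = 0.5.
asp, sat, sigma = aspiration_levels(g_x=2.0, s_x=10.0, p=0.75, q=0.5)
print(asp, sat, sigma)  # 8.0 5.0 0.375
```

With these numbers the satisficing level computed from Eq. (31) coincides with g(x) + sigma (s(x) - g(x)) from Eq. (33), as the derivation requires.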

8.2 Shrinking 

The inequality $s(x) - g(x) \ge g(y) - g(x) \ge \theta\, d(x,y)$ for all $y \in S(x)$ implies that

$$s(x) - g(x) \ge \theta\, \rho(S(x)). \tag{35}$$

We now show how the worthwhile-to-move process shrinks. We assume that the agent can
find some $x_{n+1} \in S(x_n) \subset W(x_n)$ which improves enough:

$$g(x_{n+1}) - g(x_n) \ge \underline\sigma\, [s(x_n) - g(x_n)]. \tag{36}$$

The Shrinking Proposition: If

i) the utility function is upper bounded,

ii) the agent has succeeded in defining and enclosing his worthwhile-to-move process,

iii) the agent explores enough around to be able to find $x_{n+1} \in S(x_n) \subset W(x_n)$ which
improves enough, or such that $g(x_{n+1}) - g(x_n) \ge \underline\sigma\, [s(x_n) - g(x_n)] \ge 0$,

then, the worthwhile-to-move process shrinks, which means that the radius $\rho(W(x_n))$ of the
worthwhile-to-move set converges to zero.

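Proof: By assumption iii) and using Eq. (35),

$$g(x_{n+1}) - g(x_n) \ge \underline\sigma\, [s(x_n) - g(x_n)] \tag{37}$$

$$\phantom{g(x_{n+1}) - g(x_n)} \ge \underline\sigma\, \theta\, \rho(S(x_n)). \tag{38}$$

We saw that a worthwhile-to-move process is such that $g(x_n) \to g^*$, hence
$g(x_{n+1}) - g(x_n) \to 0$. This implies that $\rho(S(x_n)) \to 0$ as $n \to +\infty$. $\square$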
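
The shrinking mechanism can be illustrated on a small example. This is a hedged sketch, not the authors' procedure: the state space is the integers 0 to 30 with d(x, y) = |x - y|, the utility is a made-up concave function, the enclosing set S(x) stands in for the worthwhile-to-move set, and the rule "take the smallest move that improves enough" is an assumption chosen only to make the path deterministic.

```python
def enclosing_set(g, states, x, theta):
    """S(x): states whose utility gain covers theta times the distance from x."""
    return [y for y in states if g(y) - g(x) >= theta * abs(y - x)]


def radius(S, x):
    """rho(S(x)): largest distance from x to a point of S(x)."""
    return max(abs(y - x) for y in S)


def shrinking_path(g, states, x0, theta, sigma):
    """Follow an "improving enough" process and record rho(S(x_n)) at each step."""
    x, path, radii = x0, [x0], []
    while True:
        S = enclosing_set(g, states, x, theta)
        radii.append(radius(S, x))
        s_x = max(g(y) for y in S)  # s(x) as in Eq. (28); S always contains x
        # States of S(x) that improve enough, as in Eq. (36).
        enough = [y for y in S
                  if y != x and g(y) - g(x) >= sigma * (s_x - g(x))]
        if not enough:
            return path, radii      # S(x) = {x}: nothing left to move to
        x = min(enough, key=lambda y: abs(y - x))  # smallest acceptable move (assumed rule)
        path.append(x)


g = lambda y: -(y - 15) ** 2        # illustrative utility, bounded above by 0
path, radii = shrinking_path(g, range(0, 31), x0=0, theta=1, sigma=0.5)
print(path)   # [0, 5, 8, 11, 13, 14, 15]
print(radii)  # [29, 19, 13, 7, 3, 1, 0]
```

In this run the radius of the enclosing set decreases at every step and reaches zero, in line with inequality (38). Other selection rules inside the "improving enough" set would give different paths.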

Instrumentalism is a key feature of this process: the goal changes along the process,
because the unsatisfied needs $\bar g(x_n) - g(x_n)$ decrease each time.

8.3 Convergence to a Rest Point: Stopping with no Residual Frustration

When does the agent prefer to stop moving? In such cases, his aspiration gap must vanish,
or $\bar g(x^*) - g(x^*) = 0$, with no residual frustration.

Behavioral Rest Points: As seen before, a performance $x^* \in X$ is said to be a behavioral
rest point if, for any $y \in X$ with $y \ne x^*$, it is not worthwhile to move from $x^*$ to $y$. This is
equivalent to saying that $x^*$ is a rest point element of the "worthwhile-to-move" relationship
$x \in X \mapsto W(x) \subset X$, or $W(x^*) = \{x^*\}$. The agent has no further incentive to move.
Thus, $x^*$ is a behavioral rest point iff for any $y \ne x^*$, $A(x^*,y) < \xi(x^*)\,C(x^*,y)$, or, with
opportunity costs, $A(x^*,y) < \xi(x^*)\,(C(x^*,y) + O(x^*,y))$. This is the case if $S(x^*) = \{x^*\}$
because $S(x) \supset W(x)$ and $x \in W(x)$ for all $x \in X$.

In terms of instantaneous advantages and costs to move, $x^* \in X$ is a behavioral rest
point if, for any $y \ne x^*$,

$$g(y) - g(x^*) < \theta(x^*,y)\, d(x^*,y).$$

In previous sections 2.2 and 8.2, we showed that a "worthwhile-to-move" process which
"improves enough," namely $\{x_{n+1} \in W(x_n),\ n \in \mathbb{N}\}$, converges toward some limit $x^* \in X$,
and that the radius $\rho(S(x_n))$ of the enclosing inclusion $S(x_n) \supset W(x_n)$, $n \in \mathbb{N}$, goes to
zero. But this does not imply that $\rho(S(x^*)) = 0$, which is equivalent to $S(x^*) = \{x^*\}$. When
is it the case? When does the agent stay at $x^*$ rather than move again? This will support the
existence of behavioral rest points. The upper semicontinuity of the utility function $g(.)$ is
a sufficient condition. The agent follows a punctuated dynamic of temporary routines to
reach a permanent routine. For the sake of clarity, we give a reduced form of our model.

A Reduced Form of the "Worthwhile-to-Move" Model. A "worthwhile-to-move"
and "improving enough" process has the following reduced form:

i) Starting from $x = x_n \in X$ and $g(x) \in \mathbb{R}$, set an intermediate aspiration level $\bar g(x) >
g(x)$ and set a feasible intermediate satisficing level $\tilde g(x)$ such that $g(x) < \tilde g(x) < \bar g(x)$.

ii) Around the current state $x \in X$, define a subset of local acceptable transitions $y \in
W(x)$, which determines a worthwhile-to-move behavior $W(.) : x \in X \mapsto W(x) \subset X$,
$x \in W(x)$ at each step.

Problem: 

iii) Explore around the current state, $x \ne y \in X$, in the exploration set $E(x, r(x))$,
choosing the size $r(x) > 0$ of the exploration set.

iv) Find $y = x_{n+1} \in W(x) \cap E(x, r(x)) = W^{E}(x) \subset X$ such that
$g(x) < \tilde g(x) \le g(y) \le \bar g(x)$.

v) If $g(y) < \bar g(x)$, start from $y \in W^{E}(x)$ and set a new intermediate goal $\tilde g(y) \le \bar g(x)$;
and so on.

vii) Stop at $x^* \in X$, when $W(x^*) = \{x^*\}$, when it is no longer worthwhile to move:
the satisficing paradox is solved. In that case, the agent has no residual frustration feeling,
because his aspiration gap is zero: $\bar g(x^*) - g(x^*) = 0$.

The Worthwhile-to-Move Theorem. Assume that 

i) the state space X is a metric space with metric d, 

ii) the instantaneous utility function g{.) is upper bounded, 

iii) the agent uses a heuristic enclosing the worthwhile-to-move inclusion $y \in W(x)$,

$$W(x) = \{y \in X : A(x,y) \ge \xi(x)\, C(x,y)\},$$

in the inclusion $y \in S(x)$,

$$S(x) = \{y \in X : g(y) - g(x) \ge \theta\, d(x,y)\}, \quad \theta > 0.$$
This means that the agent limits the control variables: he sets a maximum duration of
exploitation $0 < t(y) \le \bar t$, a minimum effort of moving $e(x,y) \ge \underline e > 0$ and a minimum
rate of non sacrificing $\xi(x) \ge \underline\xi > 0$. He also puts maximum weights on satisfaction and
deception, and $0 < \delta(x) \le \bar\delta$. Then, the acceptable transition ratio is greater than or equal
to a strictly positive level, $\theta(x,y) \ge \theta > 0$, where the minimum acceptable transition ratio is

= mi (ti) > 0. 

Then, 

a) The worthwhile-to-move, but, for the moment, not satisficing process $y \in W(x)$ can be
enclosed in the nested enclosing process $y \in S(x)$:

$$W(x) \subset S(x) \quad \text{for all } x \in X.$$

b) Let $x_{n+1} \in W(x_n)$ and $x_{n+1} \in S(x_n)$, $n \in \mathbb{N}$, $x_0 \in X$ given, be the worthwhile-to-move
and the enclosing process. If the state space is complete, the worthwhile-to-move process
converges, $x_n \to x^* \in X$ as $n \to +\infty$, whatever the starting state $x_0 \in X$. It converges in
finite time if, along the process, the total time spent for exploitation is finite and the speed
of moving $v(x,y) = d(x,y)/t(x,y)$ is greater than or equal to a strictly positive level $\underline v > 0$.

c) Moreover, if the worthwhile-to-move process improves enough, which is the case if the
agent "explores enough" around, at each step, then the process not only converges, but also
shrinks: $\rho(S(x_n)) \to 0$.

If the per unit of time utility function is upper semicontinuous, then the limit state $x^* \in X$
is a behavioral rest point: $S(x^*) = \{x^*\} \Rightarrow W(x^*) = \{x^*\}$.

The agent stops at $x^*$ with no residual frustration: $\bar g(x^*) = g(x^*)$.

Proof: Only the last point c) needs proving.

Each enclosing set $S(x) \subset X$ is closed; this is a consequence of the upper semicontinuity
of $g$. The enclosing inclusion is nested. If $x_{n+1} \in S(x_n)$ for all $n \in \mathbb{N}$, and $p > n$, then
$S(x_0) \supset S(x_1) \supset \dots \supset S(x_n) \supset \dots \supset S(x_p) \supset \dots$ Thus, $x_p \in S(x_n)$ for all $p > n$, for
any given $n \in \mathbb{N}$. From $x_p \to x^*$ as $p \to +\infty$, $x^* \in S(x_n)$ for all $n \in \mathbb{N}$, because
$S(x)$ is closed for all $x \in X$. By transitivity of $S$, $S(x^*) \subset S(x_n)$ for all $n \in \mathbb{N}$, which
implies, by the Shrinking Proposition, that $0 \le \rho(S(x^*)) \le \rho(S(x_n)) \to 0$. This gives
$\rho(S(x^*)) = 0 \Rightarrow S(x^*) = \{x^*\}$ because $x \in S(x)$ for all $x \in X$. Thus, $W(x^*) = \{x^*\}$
because of the enclosing heuristic $W(x) \subset S(x)$ for all $x \in X$.

8.4 Link with Ekeland's ε-Variational Principle

As a striking application of the "worthwhile-to-move" theorem, one obtains a cognitive
version of Ekeland's variational theorem (Aubin and Ekeland, 1984; Attouch and Soubeyran,
2006). This theorem (Ekeland, 1974) was originally a regularization theorem devoted to
ill-behaved optimization problems, through approximate optimization. We give its "supremum
formulation" in order to follow the tradition in economics, which is usually concerned
with maximization or approximate maximization problems. It concerns the case where the
function $g(.)$ possibly has no maximum on $X$.

Ekeland's Theorem: Let $X$ be a complete metric space with distance $d$. Let $g(.) :
x \in X \mapsto g(x) \in \mathbb{R}$ be an upper bounded and upper semicontinuous function. Let
$\bar g = \sup_{x \in X} g(x) < +\infty$, $\theta > 0$, and $\varepsilon > 0$. Then, for any $x_0 \in X$ such that $n(x_0) =
\bar g - g(x_0) < \varepsilon$, there exists an $x^* \in X$ such that i), ii) and iii) are satisfied:

i) $g(x^*) \ge g(x_0)$,

ii) $\varepsilon > \theta\, d(x_0, x^*)$,

iii) $g(y) - \theta\, d(x^*, y) < g(x^*)$ for all $y \ne x^*$, or $x^* \in \operatorname{argmax}\{g(y) - \theta\, d(x^*, y) : y \in X\}$.

The condition $n(x_0) = \bar g - g(x_0) < \varepsilon$ means that the initial need or frustration $n(x_0)$ is
less than a given $\varepsilon > 0$. Statement i) means that the final position $x^* \in X$ improves on
the initial position $x_0$. Statement ii) tells us that the highest possible advantage
$\varepsilon$ to move from the initial to the final position is greater than the cost to move $\theta\, d(x_0, x^*)$
from $x_0$ to $x^*$. It is worthwhile to move from $x_0$ to $x^*$. The last statement iii) says that $x^*$
is a maximum of the approximate function $y \in X \mapsto g(y) - \theta\, d(x^*, y) \in \mathbb{R}$. It tells us that
$x^*$ is a behavioral rest point because it is not worthwhile to move from $x^*$ to any different
position $y \ne x^*$.
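
On a finite state space the three statements can be verified directly. The sketch below is only an illustration under stated assumptions: the utility table is made-up data, the states are the integers 0 to 10 with d(x, y) = |x - y|, and x* is taken as a maximizer of g over the enclosing set S(x0), which suffices in the finite case because S is transitive.

```python
# Made-up utilities g(y) for the states y = 0, ..., 10 (illustrative data only).
utilities = [0, 3, 1, 4, 9, 2, 6, 5, 12, 7, 8]
states = range(len(utilities))
g = lambda y: utilities[y]
theta, x0 = 2, 0

# Enclosing set S(x0) and a maximizer of g over it.
S = [y for y in states if g(y) - g(x0) >= theta * abs(y - x0)]
x_star = max(S, key=g)

g_sup = max(utilities)  # the supremum of g, attained here because X is finite
cond_i = g(x_star) >= g(x0)
# ii) holds for every epsilon > g_sup - g(x0) as soon as the cost to move is at most that gap.
cond_ii = theta * abs(x_star - x0) <= g_sup - g(x0)
cond_iii = all(g(y) - theta * abs(y - x_star) < g(x_star)
               for y in states if y != x_star)

print(x_star, cond_i, cond_ii, cond_iii)  # 4 True True True
```

Here the rest point is the state 4 with utility 9, although the state 8 has the higher utility 12: moving from 4 to 8 would gain 3 but cost 2 x 4 = 8, so it is not worthwhile.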



8.5 From Inertia Inefficiency to Behavioral Rest Point 

Why does a "worthwhile-to-move" transition process end in an inefficient rest point exhibiting
a large inefficiency gap $\bar g - g(x^*) > 0$? The answer is: because of inertia inefficiencies,
defined as resistance to change. Their modelling involves "costs to move", including costs
to build and to sustain motivation, to set goals, to explore locally, and to move.

What makes an agent behave inefficiently? What generates local actions, small
steps to move, a long transition time, large intermediate sacrifices, large per unit of time
costs to move, a long time spent in moving, small intermediate improvements and low
intermediate advantages to move, a short intermediate exploration and exploitation? What
generates premature convergence, a local optimum, a low limit level of the final goal
$g^*$, a large inefficiency gap $\bar g - g^* > 0$, convergence in a very long time, a low speed of
convergence, an irregular punctuated dynamic, some permanent frustration? How can we
explain the emergence of routines, habits, and rest points? How do the ambivalent aspects
of habits and routines, both positive and negative, reach a balance? How can inertia lead
to inefficient behaviors?

Our answer is given through the two main inequalities which the agent has to manage
at each step (we set $x = x_n$ and $y = x_{n+1}$):

i) "not too many intermediate sacrifices"

$$g(y) - g(x) \ge \theta(x,y)\, d(x,y) \tag{39}$$

ii) "improving enough"

$$g(y) - g(x) \ge q(x)\, [\bar g(x) - g(x)] = \varepsilon(x) > 0, \text{ or} \tag{40}$$

iii) choose to stop, setting $s(x) = 0$.

The agent manages the "worthwhile-to-move" inequality Eq. (39) by choosing the "adjusted
non sacrificing ratio," using an enclosing heuristic

$$\theta(x,y) = \frac{\xi(x)\, e(x,y)}{t(y)\, \delta(x)} \ge \theta > 0 \tag{41}$$

and manages the "improving enough" inequality Eq. (40) by choosing the satisficing level
$\tilde g(x)$ and the improvement rate $q(x)$.

In this context, behavioral inefficiencies come from high goal-setting, exploration costs,
and moving costs. Inertia generates a lot of inefficiencies if the enclosing process gives a
high lower bound $\theta > 0$, and if $\tilde g(x)$ and $q(x)$ are low. Behavioral inefficiencies and premature
convergence come from limited needs and strong inertia with not enough intermediate
sacrificing, coming from too high a "non sacrificing rate" $\xi(x)$, too short a horizon $h(y)$, too
strong a preference for the present, too low an exploitation time $t(y) = a(y)h(y)$, too high
an effort to move $e(v(x))$ (coming from too high a speed of moving), too low a psychological
factor $\delta(x)$, too high local exploration costs $K(x)$.

How, then, are the "adjusted non sacrificing ratio" $\theta(x,y)$, the intermediate satisficing
level $\tilde g(x)$, the rate of "improving enough" $q(x)$, and the exploration expenditures $K(x)$
determined at each step?

Assume that the agent has limited resources and a finite supremum $\bar g < +\infty$ which is the
highest utility he can reach. Starting from the state $x \in X$, let $\bar g(x)$ be the aspiration level of
the agent, which is an estimation of his unknown supremum utility. The lower his incentive
to change, the lower his initial unsatisfied needs $\bar g(x) - g(x) > 0$. The agent can over- or
under-estimate the supremum $\bar g$. This depends on self-esteem and degree of optimism. If
his aspiration level is higher than his feasible ambition, $\bar g(x) > \bar g$, the agent will sooner or
later, after trials and errors, have to relax his aspiration level, so as to make it feasible, say
$\bar g(x) \le \bar g$. To fulfill this goal-setting task, the agent can spend time and money to imitate
successful agents or ask a coach for advice. The agent can try to reach his intermediate
aspiration level in several steps. He can set an intermediate satisficing level, an intermediate
goal $\tilde g(x)$ between his present utility level and his intermediate aspiration level such that
$g(x) < \tilde g(x) < \bar g(x)$. Then he can try to find an intermediate satisficing state $y \in W(x)$ such
that $g(x) < \tilde g(x) \le g(y) \le \bar g(x)$.

Consider the intermediate goal-setting inequality

$$g(y) - g(x) \ge q(x)\,(\bar g(x) - g(x)) = \varepsilon(x) > 0.$$

To set such an "improving enough" inequality requires being able to set an intermediate
aspiration level $\bar g(x) = g(x) + p(x)\,(s(x) - g(x))$ and a related intermediate satisficing level
$\tilde g(x) = g(x) + \varepsilon(x) = g(x) + q(x)\,(\bar g(x) - g(x))$. The equation $\bar g(x) - g(x) = p(x)\,(s(x) - g(x))$
defines the intermediate satisficing level

$$\tilde g(x) = g(x) + p(x)q(x)\,(s(x) - g(x)) = g(x) + \sigma(x)\,(s(x) - g(x)), \tag{42}$$

where $0 < \underline\sigma \le \sigma(x) = p(x)q(x) < 1$.

At each step, to set a feasible intermediate satisficing level $\tilde g(x) \le \bar g$ may be very costly.
If the agent sets an unfeasible level $\bar g(x) > \bar g$, he may take a long time to explore and discover
that there is no intermediate satisficing state $y$ such that $g(y) \ge \tilde g(x)$. The agent must either
explore further around the current state $x \in X$ or relax the intermediate satisficing
level $\tilde g(x)$, relaxing or not his intermediate aspiration level $\bar g(x)$.

If the agent succeeds in reaching the intermediate satisficing level, g{y) > g{x), he starts 
in goal setting process again, sets a new intermediate aspiration level g{y) < g{y) < g{x) < 
g{y) < g{x), relaxes this level or not. Costs of setting intermediate goals are the time 
lost for trials and errors to set feasible intermediate aspiration levels. Failure to reach an 
intermediate satisficing level can lower his self-esteem and his motivation to improve. There 
are also costs to set too low aspiration levels which do not lead the agent to explore enough. 

The agent may be over-satiated, when his intermediate residual frustration feelings $\mu(x)$
are too low, or his intermediate satisfaction feelings $\lambda(x)$ too high, generating too low a
motivation for change. These terms act positively and negatively on the intermediate goal
level $\bar g(x, \lambda(x), \mu(x))$. The agent will have an incentive to "improve enough," to set a high
enough intermediate satisficing level $\tilde g(x)$, which will give him the motivation to explore
sufficiently, if his satisfaction to improve $\lambda(x)$ remains high, or his discomfort feelings $\mu(x)$
to have unsatisfied needs remain high during the process. Discount factors can also play a
role to estimate advantages to change and costs to change in a different way.

Consider the "worthwhile-to-move" inequality (the "not too many intermediate sacrifices"
condition)

$$g(y) - g(x) \ge \theta(x,y)\, d(x,y). \tag{43}$$

The larger the inertia index

$$\theta(x,y) = \frac{\xi(x)\, e(x,y)}{(\lambda(x) + \mu(x) + \nu(x))\, t(y)\, \delta(x)}, \tag{44}$$

with $t(y) = a(y)h(y)$, the higher the final inefficiency gap $\bar g - g^* > 0$. Small intermediate
steps $d(x,y)$ may be necessary to divide very convex costs $C(x,y)$ to change. But long
intermediate steps may be necessary to improve each step enough. Too high a speed to
move $v(x,y) = d(x,y)/t(x,y)$ may be very costly. Moving may take too long an intermediate
time $t(x,y)$ to start, to move, and to stop. Enclosing more or less can lead to more or less
premature convergence. This may have contradictory effects. A lower index of inertia
$\theta(x,y) \ge \theta(x) > 0$ gives a larger cone where the agent can improve enough, with not too
much intermediate sacrificing, hence a more global process, and a higher final utility.

Not sacrificing enough at each period ($\xi(x)$ high) may generate premature convergence,
or may fail to sustain intermediate motivation to reach the final goal, because the agent does
not sufficiently compensate intermediate costs to move by intermediate advantages to move.
Other drawbacks are to take too long a time to change, to bear too high a cost to change,
or to choose too long a time to exploit and to reach the final goal in a reasonable time,
adopting a conservative behavior, not learning enough. Furthermore, exploiting for too short a
time $t(y)$ at each period discourages motivation.

Consider the exploration expenditure function

$$K(x) = T(x)k(x) = (1 - a(x))\, h(x)\, k(x)$$

which represents the amount of time, energy, and money which the agent chooses to spend to
explore around a current state $x \in X$. It is usually an increasing function of the unsatisfied
needs $\bar g(x) - g(x)$, the rate $\lambda(x)$ of contentment feelings, and the rate $\mu(x)$ of discomfort
to have missed the intermediate aspiration level.

Reference-dependent payoffs make it impossible to complete exploration in a single step. They
require further exploration. Even if, starting from a given state $x$, the agent has explored a large
part of the whole space to know not only his utility $g(y)$ but also his costs and time spent to
move $C(x,y)$ and $t(x,y)$, he will have to explore further, starting from his new state $y \in X$,
to discover his new cost and time spent to move from $y$, $C(y,z)$ and $t(y,z)$. Having to know
before exploration how much to explore forbids optimization. Too large or too complex a state
space requires exploring at each step. The smaller the state space, the more global the
exploration process should be.

The agent may explore too little around the current state, or too much. He may be
too shy or too uninterested to explore. "Improving enough" requires "exploring enough".
The agent will explore enough if he has low exploration costs, or a high enough motivation
to explore, which lowers the negative feeling of exploration expenditures and opportunity
costs, or a horizon which is far away. The longer the agent spends exploring, the less time
he can spend exploiting. This may decrease his motivation to improve, and the time
spent to converge may be too long.

A quick convergence may be good and bad. The speed of convergence may be valuable, 
but not so much if the agent is locked in a low level instantaneous utility. Convergence may 
happen earlier than for a hill climbing process (a local search optimization with no inertia) , 
but perhaps too early! The process may reach too low a local optimum, or more generally 
a rest point (which can be below a local maximum reached by a hill climbing algorithm!). 

8.6 The Clairvoyance Theorem

Consider that, at each step, the agent chooses the same radius of exploration $r(x_n) = r > 0$,
$n \in \mathbb{N}$. Our result shows that, after some finite time, the worthwhile-to-move set lies
inside the exploration ball of constant radius, because the radius of the worthwhile-to-move
set shrinks to zero. After a finite time the agent will optimize. This is a powerful result
which shows when, under inertia and frictions, an agent optimizes. This helps understand
the degree of validity of the "as if" hypothesis: even if agents do not optimize, it is "as if"
agents would optimize.
9 The "As if Hypothesis" Revisited: Local Search and Proximal Algorithms

In the "as if hypothesis," economists proceed "as if" agents optimize, even though they
do not think it is the case in many situations (Friedman, 1953).

9.1 The "As if Hypothesis" with Inertia: "A Local Search and Proximal Algorithm"

A natural way to improve the "as if hypothesis" consists of introducing inertia costs into
classical optimization programs (although Simon does not like to add further costs to solve
the bounded rationality problem, see Gilovich et al. (2002): 559-582).

Optimization algorithms ignore inertia costs, except, in a very implicit way, proximal
algorithms (Iusem, 1995; Attouch and Teboulle, 2004; Attouch and Bolte, 2009). We have
to interpret the added regularization term which characterizes proximal algorithms as a "cost
to change". This quite simple but important interpretation completely changes our view on
the "as if hypothesis". Our model shows that agents can manage inertia "as if" they use a
new algorithm which is a mixture between the two following optimization algorithms:

i) Local search algorithms: $\sup\{g(y) : y \in E(x_n, r(x_n))\}$

where $E(x_n, r(x_n)) \subset X$ is the exploration set at $x = x_n \in X$, of size $r(x_n) > 0$. Hill
climbing and simulated annealing algorithms belong to this class of algorithms.

ii) Proximal algorithms: $\sup\{g(y) - \theta\, c(x_n, y) : y \in X\}$

where $c(x,y) \ge 0$ is the regularization term which makes the goal $g(y)$ more regular. For
us, the regularization term is a cost to move, which makes proximal algorithms satisfy
the "worthwhile-to-change" relationship. In proximal algorithms, the criterion which is to be
maximized can be interpreted as a net gain function, which is a way to handle the
multi-criteria problem (improve the gain function and satisfy without too much sacrificing).

The "as if" mixture consists in solving the optimization problem

$$\sup_{y \in E(x_n, r(x_n))} \{g(y) - \theta\, c(x_n, y)\} \tag{45}$$

in order to pass from $x_n$ to $x_{n+1}$. This "worthwhile to change" algorithm is the "Local
Search and Proximal" algorithm, LSP algorithm in short. As for classical optimization, one
can assume that, at stage $n$, the agent optimizes up to some approximation level $\varepsilon_n$,

$$x_{n+1} \in \varepsilon_n\text{-}\operatorname{argmax}\{g(y) - \theta\, c(x_n, y) : y \in E(x_n, r(x_n))\}. \tag{46}$$

Starting from the current state $x_n \in X$ at step $n$, proximal algorithms take the same
exploration set, the whole space $E(x_n, r(x_n)) = X$. From a behavioral point of view this is
not reasonable, because this means solving a global optimization problem at each step (the
substantive case)! Hence, proximal algorithms do not consider the recursive cost problem
which is "how to choose the size $r(x_n) > 0$ of the exploration set at each step," "how much
to explore, depending on the costs of exploration". This problem is related to the famous
"effort-accuracy" trade-off between exploration costs and the quality of the decision (Payne
et al., 1998; Busemeyer and Diederich, 2000). To save space, we have focused our attention
on the dynamic trade-off which balances advantages to move $g(y) - g(x)$ and costs to move
$c(x,y)$.
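
The LSP step of Eq. (45) can be sketched on a finite state space. This is only an illustration under assumptions that are not in the text: states are the integers 0 to 20, the cost to move is taken to be quadratic, c(x, y) = (x - y)^2, the exploration radius r is constant, the utility is a made-up concave function, and the process stops when staying put is a best choice in the exploration ball. The names `lsp_step` and `lsp_path` are hypothetical.

```python
def lsp_step(g, x, states, r, theta):
    """One step: maximize g(y) - theta * c(x, y) over the exploration ball around x."""
    ball = [y for y in states if abs(y - x) <= r]                 # exploration set E(x, r)
    return max(ball, key=lambda y: g(y) - theta * (y - x) ** 2)   # assumed quadratic cost


def lsp_path(g, x0, states, r, theta, max_steps=1000):
    """Iterate the step until the agent prefers to stay where he is."""
    x, path = x0, [x0]
    for _ in range(max_steps):
        y = lsp_step(g, x, states, r, theta)
        if y == x:          # staying is optimal in the ball: a rest point of the sketch
            break
        x = y
        path.append(x)
    return path


g = lambda y: -(y - 15) ** 2     # illustrative utility with maximizer 15
path = lsp_path(g, x0=0, states=range(0, 21), r=3, theta=2)
print(path)  # [0, 3, 6, 9, 11, 12, 13, 14]
```

The first steps are limited by the exploration radius, the later ones by the cost to move. On this integer grid the process rests at 14 rather than at the maximizer 15, because the last unit step would gain 1 in utility but cost 2; this is an effect of the discretization combined with the cost to move.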

We have considered the simplest case of "high local costs to move" with an exploration
set of given size, where $r(x_n) = r > 0$. The "clairvoyance theorem" shows that, in finite
time, the exploration set becomes a "clairvoyance ball" where the agent can optimize
locally. In that case, the convergence of the "Local Search and Proximal Algorithm" is a
straightforward consequence of our previous results: the agent reaches a behavioral rest point
$x^* \in X$ where he prefers to stay rather than to move, setting $r(x^*) = 0$.

We show that the "Local Search and Proximal Algorithm" still enjoys the convergence
properties in the case of low local costs to move.

Theorem: Convergence of the LSP Algorithm with Low Local Costs to Move

Let $X = \mathbb{R}^p$ be equipped with the Euclidean distance $d(x,y) = \|x - y\| = \left(\sum_{i=1}^{p} (x_i - y_i)^2\right)^{1/2}$.
Let $C \subset X$ be a closed convex nonempty subset of $X$ (set of constraints, resources).
Let $g(.) : x \in X \mapsto g(x) \in \mathbb{R}$ be a function (gain, utility) which satisfies i), ii) and iii):

i) $g$ is upper bounded on $C$; let $\bar g < +\infty$ be the supremum of $g$ on $C$;

ii) $g$ is a smooth function (continuously differentiable);

iii) $g$ is quasi-concave (with convex upper level sets).

Given some initial data $x_0 \in X$, let $(x_n)_{n \in \mathbb{N}}$ be a sequence defined by the Local Search and
Proximal Algorithm with clairvoyance radius $r > 0$ and parameter $\theta > 0$:

$$x_{n+1} \in \operatorname{argmax}\{g(y) - \theta \|x_n - y\|^2 : y \in C,\ \|y - x_n\| \le r\}. \tag{47}$$

Then, the sequence $(x_n)_{n \in \mathbb{N}}$ converges in $X$ to some $x_\infty$ which is a critical point of $g$ over
$C$, namely

$$-\nabla g(x_\infty) + N_C(x_\infty) \ni 0$$

where $N_C(x_\infty)$ is the (outward) normal cone to $C$ at $x_\infty$.

Proof: By taking $y = x_n$ in Eq. (47), it is worthwhile to move from $x_n$ to $x_{n+1}$:

$$g(x_{n+1}) - g(x_n) \ge \theta \|x_{n+1} - x_n\|^2. \tag{48}$$

Summing up these inequalities and using i):

$$\sum_{n=0}^{\infty} \|x_{n+1} - x_n\|^2 \le \frac{1}{\theta}\,(\bar g - g(x_0)) < +\infty. \tag{49}$$

As a consequence

$$x_{n+1} - x_n \to 0 \quad \text{as } n \to +\infty. \tag{50}$$
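
For completeness, the summation step can be written out. The following is a routine expansion using only Eq. (48) and assumption i); the partial-sum index $N$ is introduced here for the derivation.

```latex
\begin{align*}
\theta \sum_{n=0}^{N-1} \|x_{n+1} - x_n\|^2
  &\le \sum_{n=0}^{N-1} \bigl( g(x_{n+1}) - g(x_n) \bigr)
     && \text{by Eq. (48)} \\
  &= g(x_N) - g(x_0)
     && \text{telescoping sum} \\
  &\le \bar g - g(x_0)
     && \text{by assumption i), since } x_N \in C.
\end{align*}
```

Letting $N \to +\infty$ gives Eq. (49), and Eq. (50) follows because the general term of a convergent series tends to zero.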

Hence, for $n$ large enough, $\|x_{n+1} - x_n\| < r$, which implies that the supremum in Eq. (47)
is actually achieved in the interior of the ball $B(x_n, r)$ with center $x_n$ and radius $r > 0$.
When writing the optimality conditions, for $n$ large enough, this exploration constraint is
not active and

$$-\nabla g(x_{n+1}) + N_C(x_{n+1}) + 2\theta\,(x_{n+1} - x_n) \ni 0. \tag{51}$$

The convergence of the sequence of values $(g(x_n))_{n \in \mathbb{N}}$ is also a direct consequence of Eq.
(48). It is an increasing upper-bounded sequence. We set

$$g_\infty = \lim_{n \to +\infty} g(x_n).$$

To prove the convergence of the sequence $(x_n)_{n \in \mathbb{N}}$ we introduce the set

$$S = \{x \in C : g(x) \ge g_\infty\}$$

and prove that:

a) For every $a \in S$, $\lim_{n \to +\infty} \|x_n - a\|^2$ exists.

b) Every limit point of the sequence $(x_n)_{n \in \mathbb{N}}$ belongs to $S$.

Indeed, by convexity of the norm

$$\|x_n - a\|^2 - \|x_{n+1} - a\|^2 \ge 2\, \langle x_{n+1} - a,\ x_n - x_{n+1} \rangle. \tag{52}$$
By Eq. (51) there exists some $\zeta_n \in N_C(x_{n+1})$ such that

$$x_n - x_{n+1} = \frac{1}{2\theta}\,(-\nabla g(x_{n+1}) + \zeta_n). \tag{53}$$

Combining Eq. (52) and Eq. (53),

$$\|x_n - a\|^2 - \|x_{n+1} - a\|^2 \ge \frac{1}{\theta}\, \langle x_{n+1} - a,\ -\nabla g(x_{n+1}) + \zeta_n \rangle. \tag{54}$$

We use the quasi-concavity assumption on $g$ in order to prove that

$$\langle x_{n+1} - a,\ -\nabla g(x_{n+1}) + \zeta_n \rangle \ge 0. \tag{55}$$
To that end we consider the set

$$D_n = \{x \in C : g(x) \ge g(x_{n+1})\}$$

which is an upper level set over $C$ of the function $g$. Because of the quasi-concavity of $g$ and
of the convexity of $C$, this is a closed convex subset of $X$. By a classical geometrical argument
(Rockafellar and Wets, 2004, ch. 10)

$$N_{D_n}(x_{n+1}) = -\nabla g(x_{n+1}) + N_C(x_{n+1}). \tag{56}$$

Hence,

$$-\nabla g(x_{n+1}) + \zeta_n \in N_{D_n}(x_{n+1}) \tag{57}$$

and as $a \in S \subset D_n$ (recall that $g(a) \ge g(x_{n+1})$) we obtain Eq. (55). Returning to Eq. (52)
we obtain that $\|x_n - a\|^2$ is a decreasing sequence, hence converges, which proves point a).
Concerning point b), recall that $g_\infty = \lim_{n \to +\infty} g(x_n)$. As $g$ is continuous, any limit point
$x^*$ of the sequence $(x_n)_{n \in \mathbb{N}}$ satisfies $g(x^*) = g_\infty$. As $C$ is a closed set and $x_n \in C$ for all
$n \in \mathbb{N}$, we still have $x^* \in C$ at the limit. These two results imply $x^* \in S$, which is point
b).

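The Opial argument consists of deducing that the whole sequence $(x_n)_{n \in \mathbb{N}}$ converges
from a) and b). This sequence is bounded because $\lim_{n \to +\infty} \|x_n - a\|^2$ exists for every
$a \in S$ and $S \ne \emptyset$. If the sequence $(x_n)_{n \in \mathbb{N}}$ has two limit points, set

$$x_{n_1} \to x_1^* \quad \text{and} \quad x_{n_2} \to x_2^*. \tag{58}$$

By point b), $x_1^*$ and $x_2^*$ belong to $S$.

Using point a), $\lim_{n \to +\infty} \|x_n - x_1^*\|^2$ and $\lim_{n \to +\infty} \|x_n - x_2^*\|^2$ exist. Subsequently,

$$\lim_{n \to +\infty} \left( \|x_n - x_1^*\|^2 - \|x_n - x_2^*\|^2 \right) \text{ exists} \tag{59}$$

which after simplification yields

$$\lim_{n \to +\infty} \langle x_n,\ x_2^* - x_1^* \rangle \text{ exists.}$$

Specializing this result to the two subsequences $x_{n_1}$ and $x_{n_2}$ which converge respectively to
$x_1^*$ and $x_2^*$,

$$\langle x_1^*,\ x_2^* - x_1^* \rangle = \langle x_2^*,\ x_2^* - x_1^* \rangle \tag{60}$$

that is, $\|x_1^* - x_2^*\|^2 = 0$. Hence the sequence $(x_n)_{n \in \mathbb{N}}$ has a unique limit point and converges
in $X$ to some $x_\infty$.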

By passing to the limit in Eq. (51),

$$-\nabla g(x_{n+1}) + N_C(x_{n+1}) + 2\theta\,(x_{n+1} - x_n) \ni 0, \tag{61}$$

and by using Eq. (50) together with the smoothness of $g$ (assumption ii)) and the closedness
property of the graph of the normal cone mapping $x \mapsto N_C(x)$, we finally obtain that $x_\infty$ is
a critical point of $g$ over $C$, namely

$$-\nabla g(x_\infty) + N_C(x_\infty) \ni 0. \tag{62}$$



In our revisited "as if hypothesis", in a context of inertia, most of our behaviors are
intermediate goal-setting behaviors. The resulting process can be seen as a mixture of a local
search optimization model with a proximal algorithm. It adds an intermediate goal-setting and
search process of exploration-exploitation and moving. Global optimization becomes a limit
case in the absence of inertia costs.



9.2 Comparisons between local proximal and optimization algorithms

Local search proximal algorithms were devised to describe real life human behaviors. They
work on complex state spaces and deal with radical uncertainty and with the physiological,
psychological, and cognitive limitations of agents. They are also optimization tools.

1. The question is: do these algorithms provide a realistic description of the dynamical
and stationary aspects of real life human behaviors? The word algorithm is misleading
because these algorithms do not claim to solve optimization problems! On the contrary,
because of inertia and frictions which generate costs to change during
transitions, they help understand how humans can ultimately reach rest points (permanent
routines) which correspond to inefficient outcomes, located far away from the
optimum. Our algorithm is better viewed as a discrete-time dynamical system describing
human decision processes when inertia matters and changing entails a cost.
These algorithms involve three basic blocks: exploration, transition with inertia,
and goal-setting, with the corresponding control parameters. They allow
us to describe a large spectrum of behaviors, from muddling through behaviors,
satisficing, satisficing and "worthwhile-to-move," to global optimization. Attouch and
Soubeyran (forthcoming) give several applications of "worthwhile-to-move" behaviors
and "local search proximal" algorithms. We show how the agent can overcome inertia
by using adaptive behaviors involving long term goals and short term intermediate
goals. Realistic decision-making models must take care of the well-being of the agents
during transitions: because of inertia, agents reject transitions with too many intermediate
sacrifices (costs to change and costs to learn how to do a new action). During
transitions humans ought to survive! Most optimization algorithms do not take this
aspect into account.

2. At each step, when using local search proximal algorithms, one has to solve an
optimization problem. Replacing a single optimization problem by a sequence of
optimization problems has advantages. In usual proximal algorithms the quadratic cost
to change is interpreted as a regularization term. The first order optimality condition
for

$$x_{n+1} \in \operatorname{argmax}\left\{\phi(y) - \frac{1}{2\lambda}\|x_n - y\|^2 : y \in X\right\} \tag{63}$$

gives

$$\frac{1}{\lambda}\,(x_{n+1} - x_n) - \partial\phi(x_{n+1}) \ni 0. \tag{64}$$

This Eq. (64) is an implicit discretization of the continuous first order gradient system

$$\dot x(t) - \partial\phi(x(t)) \ni 0 \tag{65}$$

which is the steepest ascent method. This continuous dynamical system plays a central
role in optimization, differential geometry, and physics.

Proximal algorithms share most of the large time convergence properties of this dy- 
namical system. Convergence properties were established in the case of a concave 
upper semicontinous function (f) (Rockafellar, 1976), and in a non convex setting by 
using the Kurdyka-Lojasiewicz inequality, which is valid for a large class of possibly 
non-smooth functions including real analytic or semialgebraic functions (Attouch and 
Bolte, 2009). These convergence properties hold even if the initial optimization prob- 
lem is ill posed with a continuum of solutions. The algorithm asymptotically selects a 
particular one, depending on the initialization. 

The definition of proximal dynamics still makes sense in spaces without differentiable
structures, where the norm is replaced by some metric or relative entropy. It allows us
to define steepest ascent dynamics in metric spaces (see Ambrosio et al., 2005, for gradient
flows in the space of probability measures and applications to the Monge-Kantorovich
optimal transport problem).

These algorithms also allow for general decomposition or splitting results for structured
optimization problems. This is a key property for obtaining convergence of
best reply dynamics for potential games (Attouch et al. 2007, 2008).
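
As a minimal illustration of the proximal step in Eqs. (63)-(64), consider a hypothetical
one-dimensional concave goal $\phi(y) = -\frac{a}{2}(y - c)^2$, which is not taken from the
text. Assuming a quadratic cost to change with weight $1/(2\lambda)$, the first order
condition can be solved in closed form. The names `prox_step_quadratic`, `proximal_path`,
`lam`, `a` and `c` are illustrative assumptions only.

```python
def prox_step_quadratic(x, lam, a, c):
    """One proximal step for the assumed goal phi(y) = -(a/2) * (y - c)**2, a > 0.

    Maximizes phi(y) - (1 / (2 * lam)) * (y - x)**2 over y. The first order
    condition (1 / lam) * (y - x) + a * (y - c) = 0 has a closed-form solution.
    """
    return (x + lam * a * c) / (1.0 + lam * a)


def proximal_path(x0, lam, a, c, n_steps):
    """Iterate the proximal step n_steps times, starting from x0."""
    path = [x0]
    for _ in range(n_steps):
        path.append(prox_step_quadratic(path[-1], lam, a, c))
    return path


if __name__ == "__main__":
    # Hypothetical parameters: start at 0, maximizer at c = 3.
    for x in proximal_path(0.0, lam=1.0, a=1.0, c=3.0, n_steps=5):
        print(x)
```

Each iterate is an implicit Euler step of size `lam` for the gradient system (65), because the
gradient of the goal is evaluated at the new point rather than at the current one. In this
sketch the distance to the maximizer shrinks by the factor `1 / (1 + lam * a)` at every step.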

3. Our algorithm performs better or worse than some other algorithms depending on the 
context. Let us briefly compare it with the simulated annealing algorithm, which is 
a widely used local search optimizing algorithm (a computer context). An extensive 
comparison is available in Attouch and Soubeyran (2008, working paper). Simulated 
annealing aims at reaching an optimum of an unknown function on a known and finite 
state space, after a finite number of steps. Before any local exploration, it uses a given
neighbour structure of search, a given probabilistic generation rule for new actions,
and a probabilistic acceptance rule that accepts an improving action or, "from time to
time," a worsening one. In a human context, local search proximal algorithms offer several
improvements with realistic features:

(a) the state space can be infinite, and even non compact; 

(b) the agent does not know the geometry of the state space ex ante, hence the content
of any exploration set before doing exploration. He cannot determine ex ante the
probability of picking a neighboring action, which prevents him from using a probabilistic
process;

(c) the agent can have a semicontinuous, nondifferentiable long-term objective;

(d) the agent has not only a long-term objective, but also a short-term one, which
spares him too many temporary sacrifices during the transition (because of costs
to change): during a transition the agent ought to survive.

(e) the agent can have a more goal-oriented objective than the willingness to improve 
his situation (a "muddling through" behavior). He can set intermediate objec- 
tives, such as temporary satisficing ("improving enough"). This allows us to 
model the "intermediate goal-setting processes" of Vroom (1964), and the "hard 
goal effect" of Locke and Latham (1990). 

(f) the context includes inertia through costs to change, attention, exploration,
learning, switching, and adaptation costs.

(g) the context leads us to consider learning costs as costs to know how to do a new
action (a major case of inertia), through coercion or goodwill.
(h) the exploration process is not necessarily local. The exploration can proceed by
visiting neighbors of neighbors. Indeed, an agent can start with large steps, and
end with small steps.

(i) the exploration process adapts the temporary satisficing process to the amount 
of exploration which is to be done at each step. 

(j) the decision-making process is rather realistic in a human context. The agent is
not supposed to use sophisticated mathematical tools to make a decision.

(k) the algorithm escapes from local extrema. The agent can do large and decreasing 
steps at the very beginning and afterwards from time to time. 
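
To make the contrast between the two acceptance rules concrete, the sketch below places the
standard Metropolis acceptance rule commonly used in simulated annealing next to a
deterministic "worthwhile-to-move" test. It is an illustration under assumptions, not the
comparison carried out in the working paper: the function names, the temperature parameter
and the constant fraction `xi` are hypothetical.

```python
import math
import random


def sa_accept(delta, temperature, rng):
    """Metropolis rule for a maximization problem, with delta = g(y) - g(x).

    An improving move is always accepted; a worsening move is accepted
    "from time to time", with probability exp(delta / temperature).
    """
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)


def worthwhile_accept(advantage, cost, xi):
    """Deterministic rule: advantages to move must cover a fraction xi of the costs."""
    return advantage >= xi * cost


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator, so the run is reproducible
    print(sa_accept(0.5, 1.0, rng))          # improving move
    print(worthwhile_accept(0.2, 1.0, 0.5))  # advantage too small for the cost
```

The first rule needs a probability law over neighbors and a temperature schedule fixed in
advance, whereas the second only needs a comparison of two estimated quantities at the
current state.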

10 Conclusion 

Our dynamical heuristic model concerns behaviors, defined as successions of decisions and
actions. An action is a move in a decision space X. We examine why agents do something
or not, and how they choose to do it. They can choose to do nothing, to keep their way of
doing, or to innovate. The reference is "what and how" things were done previously. Our
model involves six interrelated blocks: 

1. Incrementalism. The "worthwhile-to-move" or moving block

$$y \in W(x) = \{\, y \in X : A(x, y) \geq \xi(x)\, C(x, y) \,\} \tag{66}$$

with frictions and inertia models the transition from a temporary routine x to a new one
y. The "worthwhile-to-move" principle says that a move is acceptable if its estimated
advantages to move $A(x, y)$ are higher than some proportion $1 \geq \xi(x) > 0$ of its expected
costs to move. The dynamic of change follows an acceptable transition path, where
short-term sacrifices are sufficiently few.

2. Instrumentalism. The motivation building and goal-setting block 

$$y \in I(x, \varepsilon(x)) = \{\, y \in X : g(y) - g(x) \geq \varepsilon(x) > 0 \,\} \subset X \tag{67}$$

includes improving, "improving enough," and intermediate satisficing processes with not too
much sacrificing (g is an instantaneous utility function).

3. Local exploration and search. A basic ingredient is the local exploration block 

$$y \in E(x, r(x)) \subset X, \tag{68}$$

with $r(x)$ being equal to the radius of the exploration set around $x \in X$.

4. Heuristics. By using parsimonious heuristics the agent localizes, encloses the process, 
and cuts the cost regression paradox (to "know how to know how..."). In the context 
of radical uncertainty, we do not use probabilities, but rather set membership, which is 
identified by inequalities to better model the degree of flexibility, fuzziness, and adaptability 
of a behavior. 

5. Punctuated dynamic. Transitions matter, because they are necessary to reach the 
goal (a succession of jumps from a temporary routine to another one). "Walking on the 
road" gradually becomes as important as reaching the goal itself. This irregular dynamic 
articulates decisions and actions along static phases of exploration-exploitation and dynamic
moving phases.

6. Physical, physiological, psychological, cognitive, and social features. We model many
behaviors when there is a lack of knowledge (knowledge acquisition costs) and frictions
(goal-setting costs and costs of moving) (Huitt, 1992).
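
The first three blocks, Eqs. (66)-(68), can be combined into a small simulation. The sketch
below is only one possible instance, under assumptions that are not in the text: the state
space is the real line, the utility `g` is a hypothetical quadratic peaked at 2, advantages
to move are `g(y) - g(x)`, costs to move are the distance `|y - x|`, the exploration set is
a finite grid, the parameters `r`, `xi` and `eps` are constant, and the agent picks the best
acceptable candidate.

```python
def g(y):
    """Hypothetical instantaneous utility, maximal at y = 2."""
    return -(y - 2.0) ** 2


def explore(x, r, n=21):
    """E(x, r): a finite grid of n candidates within distance r of x."""
    return [x - r + 2.0 * r * k / (n - 1) for k in range(n)]


def improves_enough(x, y, eps):
    """y in I(x, eps): the utility gain is at least eps > 0."""
    return g(y) - g(x) >= eps


def worthwhile(x, y, xi):
    """y in W(x): the gain g(y) - g(x) covers a fraction xi of the cost |y - x|."""
    return g(y) - g(x) >= xi * abs(y - x)


def run(x0, r=0.4, xi=0.5, eps=1e-3, max_steps=1000):
    """Move to the best explored candidate that is both improving enough and
    worthwhile; stop at a rest point, where no explored candidate qualifies."""
    path = [x0]
    for _ in range(max_steps):
        x = path[-1]
        candidates = [y for y in explore(x, r)
                      if improves_enough(x, y, eps) and worthwhile(x, y, xi)]
        if not candidates:
            break  # rest point: the agent prefers to stay
        path.append(max(candidates, key=g))
    return path


if __name__ == "__main__":
    print(run(0.0))
```

With these hypothetical parameters the path stops at a rest point below the maximizer of
`g`: close to the peak, the remaining gain no longer covers the required fraction of the
cost to move.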

Our model links together complementary "motivation building" (goal setting),
"exploration around," and "moving" (changing, learning) tasks, which are the basic blocks of
"adaptive decision-making" behaviors. Incremental decision-making processes are
described as a succession of comparisons between state-dependent intermediate behavioral
advantages and costs to change, with respect to intermediate advantages and costs to stay.

We characterized incremental behavior as a "worthwhile-to-move" behavior. At each 
step, intermediate advantages to move are required to be higher than some fraction of the 
costs to move to bound intermediate sacrifices. These apparently inefficient behaviors can 
be rationalized as sparing behaviors. Incremental decision-making processes ("muddling
through") represent the simplest case where the motivation-and-goal process is to do better
than before, which limits exploration and moving. Satisficing processes represent more 
goal-oriented "worthwhile-to-move" behaviors. 

We rationalized incremental behaviors in a general context including lack of knowledge 
(costs of exploration), lack of motivation (costs to build or sustain motivation, goal-setting 
costs), inertia and friction during the transition (costs to change, costs to learn). 

We need only a metric space; vectors are not required. Utility is upper-bounded
to reflect limited resources. The agent encloses his "worthwhile-to-move relationship" by
putting bounds on his control variables, by choosing not too long an exploitation period at
each step, by using a minimum effort to move (for example when the speed of moving is
not too low and the per unit of time effort to move increases with the speed of moving),
and by taking not too low a sacrifice index and putting not too heavy weights on temporary
satisfaction and disappointment (which amounts to high local costs to move).

Under such assumptions, by following a "worthwhile-to-move" process, the agent avoids
intermediate sacrifices during the transitions. This leads him to local actions in an
endogenous way: local action is no longer a hypothesis as it is for hill climbing algorithms; it is
now a consequence of inertia and frictions. Specifically,

• a) For a low goal-oriented behavior, where the agent just wants to improve step by
step by following the "worthwhile-to-move" relationship, the dynamical process has the
local action property. It is nested and, when the state space is complete, it converges
to some final state. Furthermore, if the agent spends a finite time for exploitation and
the speed of moving is high enough, it converges in a finite number of steps.

• b) When the behavior is more goal-oriented, starting from a given initial state, the
process converges to a rest point, where the agent prefers staying to moving on.
Habits and routines represent specific examples of rest points where an agent no longer
needs to think before acting.
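
The role of completeness in case a) can be illustrated by a standard finite-length
argument. The following is a sketch under simplifying assumptions that are stated here
for illustration and are not the general hypotheses of the model: advantages to move are
$A(x,y) = g(y) - g(x)$, costs to move satisfy $C(x,y) \geq d(x,y)$ for the metric $d$, the
sacrifice index is bounded below by a constant $\underline{\xi} > 0$, and $g$ is bounded
above by $\bar{g}$.

```latex
% Illustrative assumptions: A(x,y) = g(y) - g(x), C(x,y) >= d(x,y),
% xi(x) >= \underline{\xi} > 0, and g <= \bar{g} < +infinity on X.
\begin{align*}
x_{n+1} \in W(x_n)
  &\;\Longrightarrow\;
  \underline{\xi}\, d(x_n, x_{n+1}) \leq g(x_{n+1}) - g(x_n), \\
\underline{\xi} \sum_{n=0}^{N-1} d(x_n, x_{n+1})
  &\leq g(x_N) - g(x_0) \leq \bar{g} - g(x_0) < +\infty .
\end{align*}
```

Under these assumptions the path has finite total length, so $(x_n)$ is a Cauchy sequence and
converges when the state space is complete. High enough local costs to move thus bound the
total distance travelled by the available utility gain.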

In this context of friction, our model helps give a qualitative answer to the question:
"how far from optimizing do agents behave in real life, depending on the context?" Our
model helps calibrate the size of the inefficiencies (the departure of a given behavior with 
respect to its substantive formulation as an optimization model). It gives us a tool to know 
when the "as if hypothesis" is indicated, when optimization is a good enough approximation 
of a given behavior. To this purpose our model must 

i) specify how to catch behavioral inefficiencies, how to model lack of knowledge, frictions 
and goal-setting inefficiencies and efficiencies. This problem becomes more complicated when 
the model involves costs to move; 

ii) define dynamic inefficiency indices, which amounts to calibrating the inefficiency gaps
of a behavior with respect to its substantive formulation;

iii) link the inefficiency gaps of a behavior to the characteristics of both the agent and 
his environment. 

Concerning costs to move, inefficiency indices also include the total costs to move, the
mean velocity, and the intermediate sacrifices. Several quality-cost ratios were defined.

At a mathematical level, this led us to revisit the "as if hypothesis" and to bring to the fore
"local search proximal algorithms," which combine local search with proximal algorithms.